Database optimization often goes unnoticed, but it is very important for saving time and CPU usage. I saw this firsthand while running queries on the math database downloaded from Stack Exchange. A query is an expression written in a programming language, in this case SQL, that is used to look up data in a database. The Stack Exchange data dump is a large compressed file that contains a record of all the forums from the website. I specifically used the math files since they are among the largest in size and serve as a perfect example of why database optimization and finding efficient SQL queries are important. I have noticed that complicated queries can be optimized to produce faster, more complete results. The files I used to create my database ranged anywhere from 30 megabytes to 600 megabytes each and inserted a few hundred thousand lines of data into my tables. At that size, even simple queries such as “SELECT * FROM posts”, which normally take less than a second to run, were taking as long as 10 or 15 seconds. From experimentation and research, the best ways to speed up queries are knowing your database, being specific, and using techniques such as indexes.
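To make "being specific" and "using indexes" concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not the setup used for the measurements above: the table is a small in-memory stand-in, and the column names, the rows, and the index name idx_posts_owner are all invented for illustration.

```python
import sqlite3

# In-memory stand-in for the real database; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (id INTEGER, title TEXT, owner_id INTEGER, created TEXT)"
)
rows = [
    (i, f"Post {i}", i % 100, f"2014-01-{(i % 28) + 1:02d}")
    for i in range(10000)
]
conn.executemany("INSERT INTO posts VALUES (?, ?, ?, ?)", rows)

# Being specific: name the columns and filter the rows instead of SELECT *.
query = "SELECT id, title FROM posts WHERE owner_id = ?"


def query_plan(sql, params):
    """Return SQLite's plan description for a query as one string."""
    plan = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " ".join(row[3] for row in plan)  # 4th column holds the plan text


plan_before = query_plan(query, (42,))

# An index on the filtered column lets SQLite avoid reading every row.
conn.execute("CREATE INDEX idx_posts_owner ON posts (owner_id)")
plan_after = query_plan(query, (42,))

matches = conn.execute(query, (42,)).fetchall()
print(plan_before)
print(plan_after)
print(len(matches))
```

Before the index exists, the plan should report a full scan of posts; afterwards it should report a search that uses idx_posts_owner. The exact wording of the plan text varies between SQLite versions, so it is best read as a hint about what the engine is doing rather than as a timing result.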
Matthias Jarke and Jurgen Koch's paper Query Optimization in Database Systems also shows the continued importance of finding ways to make efficient queries when using a database system: “Efficient methods of processing unanticipated queries are a crucial prerequisite for the success of generalized database management systems.” (Query Optimization in Database Systems) The paper explains the importance of knowing the database and what you are searching for, as well as finding alternate ways to write a query, such as eliminating unnecessary lookups. It also explains why fast, efficient queries are so important. A big reason is the “communication cost”, “storage cost”, and “computation cost” of a query. The communication cost is the time it takes to transmit the data from site to site. (Query Optimization in Database Systems) The storage cost is the “cost of occupying secondary memory storage and buffers over time.” (Query Optimization in Database Systems) Finally, the computation cost is the time that the central processing unit is being used. (Query Optimization in Database Systems) To sum up, query optimization is important so that these costs can be reduced as much as possible even as the amount of data increases.
Using the schema users(id integer, name text, rep integer, about text), posts(id integer, title text, owner_id text, post_id, created text), com(comment_id integer, post_id integer, comment_text text, user_id integer), I ran the query “SELECT COUNT(post_id), created FROM posts GROUP BY created” to get a count of how many posts were made on each day. This took anywhere from five to ten seconds to run, which is a considerable amount of time for a single query. The problem with the query is that it is...

