Sunday, February 22, 2009

High Performance Database Tips

Some Database mantras discovered on the basis of practical experience.
  • Use the smallest datatype possible. eg. where a integer can take care of all possible states, don't use bigint. Every bit of extra space alloted will make the DB to grow in size without any actual utility, and there by taking toll on the performance
  • Choose what to index and how much to index. Indexing is creating a meta data of data. If data is being fetched on the basis of a particular attribute, make that attribute an index. It would will definitely boost-up the reads and updates. Also, keep in mind that indexing increases the insert time and these meta-data information do occupy disk space. Partial Indexing is a good way to create indexes not the on the complete field, but the deciding part of it. eg. Index can be created for the first few character's URLs if data to be fetched is on the basis of host name.
  • Choose storage engine wisely. eg. If the DB is not relational, go for MyISAM. Read the pros-cons of each engine before choosing one.
  • Controlled Data Redundancy. Higher the normalisation of a DB, lesser is the redundancy. But, one think to keep in mind is that retrieving data from such a DB might require joining of tables, which is a very expensive operation. If the size of DB is huge or if the data to be fetched is distributed across a significant no. of tables, then performance will take the hit. One has to make a educated choice between space and performance. My experience tells me that a bit of redudancy isn't that bad !
  • Fetch only relevant data. If you require some fields of a table(s), retrieve only those fields. Don't do a 'select *'.
  • Better programming logic: When programitically accessing DB use reuse connections. eg. If on the same DB schema, we need to fire 2 quries to retrive logical/related set of data, try using the same connection. Creating connection is costly. Also, Make sure to close all connections when use is over.
  • Don't run all servers on the same node. If application receives a huge no. of hits, and also tremendous DB operations are taking place in the DB (like huge no. of inserts/read/writes), its better to host the application on a node diffrerent from the DB. There is only so much a single node can handle.

No comments: