17 September 2009
Say you have a table of names that you want to alphabetize. Of course you could SELECT and ORDER BY name ASC. Lets assume that’s not an option and you want to save their alphabetical order in a special column so then you can ORDER BY this column. This might be useful if you have an application where users can manually order their items but you give them them option to reset to alphabetical. Here’s what your table looks like:
people
- id (int)
- name (varchar)
- sort (int)
Here’s how you can populate that sort column with integers that, when ordered by, put the results in alphabetical order:
SET @i=0;
UPDATE people
INNER JOIN (
SELECT id, @i:=@i+1 AS i FROM people ORDER BY name ASC
) counter ON counter.id = people.id
SET people.sort = counter.i;
The SET command initializes a variable. The subquery orders the list and adds a new column (i) that counts off each row effectively. Then the outer query joins on this by the PRIMARY id so that we match the ordered row from the subquery with the outer query. And lastly, we store the i value back in the table.
16 September 2009
I’ve never understood (well, I’ve never looked into) why MySQL defaults to that swedish collation. If your tables are a mix of collations, here is a nice php script that will convert all the tables to the collation of your choosing. It even has a nice GUI you use in the web browser. Download the script here (courtesy of phoca.cz).
30 December 2008
This is an easy thing but I never took the time to figure it out because I use phpMyAdmin so much. Exporting a database is easy with phpMyAdmin, but importing if it is large can be hard because a lot of times your connection will time out or PHP is restricting how big you can upload or whatever. If you can SSH, this is easier:
mysql -u USER -p DBNAME < dump.sql
If you are on Media Temple’s DV server, like us, you can do this:
my DBNAME < dump.sql
So much easier!
14 March 2008
I’m working on a project where people have profile pages and we need to show a counter for how many views they’ve gotten. I saw two ways of doing this:
- Create an MySIAM table and everytime there is a view, insert a new row to represent the view. This could be nice because adding a timestamp would let you track views over time. Or stuff like IP address. To see the total, do COUNT(*) across a given user_id.
- Create an InnoDB table with a row for each user and a column that represents the count. On each view, do an update and increment the count. Then read that number out as the total. This seemed good because I wouldn’t have a single table with a HUGE amount of rows.
Google didn’t help me find any direction so I made a test. I created a table for each. In the case of the MySIAM I filled it with 10 million test records. Then I echoed out the total.
The result I found was that they basically take the same miniscule amount of time. Because I indexed the MySIAM on the user_id it optimizes really well and doesn’t seem to care that there are only 1000 views per user amongst the 10 million. Only how many views a user has. So a user with 10000 views takes longer to total than one with 100. I’m going to go with the MySIAM since it is storing richer data that may be useful down the road. Figuring I can switch to a InnoDB implementation later if I need to.
28 February 2008
Here’s another interesting query. Say you want to get a list of items, ordered by something, lets say date. But you want a particular row at the top, it’s special. Well check this out:
SELECT id
FROM games
ORDER BY id=5 DESC, dcreate DESC
“id=5″ converts to a 1 if true, 0 if false on all the rows that are returned. By ordering that boolean by DESC, you put the 1 at the top. The rest are all 0. Thus, you have your special row with id=5 at the top and the rest ordered by date! I’m guessing this doesn’t optimize super well, I guess we’ll see.
27 February 2008
Check out this query:
SELECT
games.title AS game,
GROUP_CONCAT(DISTINCT artists.name ORDER BY artists.name ASC SEPARATOR ', ') AS artists
FROM games
INNER JOIN tracks ON tracks.game_id = games.id
INNER JOIN artists ON artists.id = tracks.artist_id
GROUP BY games.id
I want to get a list of the soundtracks of all the games. The list should show me the name of the game and all of the artists that contributed to the soundtrack. I didn’t want to do two queries to select first the games and then the artists. This GROUP_CONCAT function saves the day. When you’re doing a grouping you can use it to join the columns that were collapsed with some separator, like they were an array! Nice!
6 December 2007
Here’s some more mysql odds and ends I’ve been picking up:
- MySIAM and no deletes: If you never delete rows from a mysiam table, mysql won’t lock the whole table when you do inserts. In other words, people can run selects on the table while an insert is happening. I’m build this application right now with a column name is_deleted that I set to “1″ where I would have deleted the row. Then all my SELECTS have to have a “is_deleted=’0′” in the WHERE. I wonder if complicating my WHERE actually adds more time than not deleting saves…
- Joins: According to the High Performance MySQL book I’m reading, choosing the order to join table is “one of MySQL’s weakest skills.” Figuring out best to do the join can take longer than running the SELECT. Thus, I’m taking from this: keep JOINs to a minimum.
- SQL_CALC_FOUND_ROWS: Apparently using FOUND_ROWS() can be slower than running a second, non LIMIT-ed SELECT when you are trying to find how many total rows match your query.
14 November 2007
I installed Leopard on my work comp last night. Because I was scarred of php/mysql config from Mark’s post I ended up using Mamp. Just learned of it this morning and installed it, but it’s working pretty nice. One useful thing is it makes it really easy to change your apache directory, so you could test sites out as the root of the webserver without any host file / vhost shenanigans. Also, you can quickly switch between php 4 and 5. I wish it installed as a system pref instead of an app, but still, pretty rad.
13 November 2007
I’m boning up on architecture design while I get ready for a big community site we’re making. I learned some new things about MySQL this morning (reading Ben’s “High Performance MySQL” book I thought would be useful to share:
- MySQL will only ever use one index per table per query. Therefore if you have a query that has multiple columns it’s matching in it’s where statement, it only matters that one of them is indexed. And, if I’m correct, you’re better off having the column with the most diversity indexed.
- This sort of follows from the previous, but this where multi column indexes come in handy. To use the book’s example, if you are always SELECTing based on last name and first name, if they are each separate indexes only one will be used. Thus, to fully optimize that SELECT, you’d want to have a multicolumn index over both of them
- If you’re using a UNIQUE column just as a way to not allow duplicate entries in a table, it adds overhead. You may be better of checking for duplicates in your application before INSERTing. This may be an occasion to use InnoDB’s transactions too.