Surprising results
Jun. 25th, 2010 11:43 amI'm writing a little web-based word game, and put together a Python script to compute scores, which involves scanning a wordlist.
Scanning the list by opening a text file and iterating through: 7 seconds
Importing the list into a SQLite DB and using SELECT LIKE: 15 seconds
Computing the set of substrings in Python and using SELECT IN: 19 seconds
If you'd asked me going in, I would have expected these to be in the other order. Of course, there are some other tricks I can start using with indexing the SQL tables and optimizing my queries that will help.
Scanning the list by opening a text file and iterating through: 7 seconds
Importing the list into a SQLite DB and using SELECT LIKE: 15 seconds
Computing the set of substrings in Python and using SELECT IN: 19 seconds
If you'd asked me going in, I would have expected these to be in the other order. Of course, there are some other tricks I can start using with indexing the SQL tables and optimizing my queries that will help.
Results are not surprising
Date: 2010-06-25 03:51 pm (UTC)I'm not sure exactly what you mean by "Computing the set of substrings in Python and using SELECT IN," so I can't comment on its performance.
If you want this to be sped up by the DB, you're going to have to build a more complex data model that does all the heavy lifting in advance, so you end up with queries that can benefit from indices.
That's, e.g., what password crackers for old-style UNIX crypt()ed passwords do -- they've essentially calculated huge numbers of encrypted password strings in advance, so cracking a password often requires no more than a single indexed table lookup.
Re: Results are not surprising
Date: 2010-06-25 05:06 pm (UTC)(In the particular game I'm building, your score is based on the longest words that can be found in the grid you've filled in. So if along one particular path you have the letters SWREN, you'd get 4 points. That's what I mean by having Python precompute the substrings: I mean
[s[j:k] for j in range(0,4) for k in range(j+1, 5)]