If you are a student trying to improve your SAT score, you may want more practice questions. However, Question of the Day only offers a limited number of questions, for example about a month's worth. But nothing can stop a geek! Let’s use Python and MongoDB to solve this problem.
My first step to steal those questions is to find the unofficial API:
When you select a date on the website, the URL changes from
sat-question-of-the-day?questionId=<span style="text-decoration: underline;">20151113</span>&tq=1
to
sat-question-of-the-day?questionId=<span style="text-decoration: underline;">20151112</span>&oq=1
The questionId is just the selected date in YYYYMMDD form.
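Since the id is only a date, the list of URLs to crawl can be generated rather than clicked through. Here is a minimal sketch of that idea. The host in `BASE_URL` is a placeholder (only the path and query string are known here), the helper names are made up for illustration, and using `oq=1` for every date is an assumption based on the second URL above.

```python
from datetime import date, timedelta

# Placeholder host: only the path and query string appear in the URLs above.
BASE_URL = "https://example.invalid/sat-question-of-the-day"


def question_ids(start, end):
    """Yield one YYYYMMDD id per day from start to end, inclusive."""
    day = start
    while day <= end:
        yield day.strftime("%Y%m%d")
        day += timedelta(days=1)


def question_url(question_id):
    """Build the URL for one id (assumes oq=1 works for any date)."""
    return f"{BASE_URL}?questionId={question_id}&oq=1"


ids = list(question_ids(date(2015, 11, 12), date(2015, 11, 13)))
print(ids)  # ['20151112', '20151113']
print(question_url(ids[0]))
```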
So I wrote the following:
The program above uses multithreading because a crawler’s speed is usually limited by I/O rather than computation. The NoSQL database MongoDB stores the data because of its simple, schema-less document model.
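To illustrate why threads help with I/O-bound work, here is a rough sketch using the standard library's `ThreadPoolExecutor`. It is not the original program: `crawl`, `fake_fetch`, and the `fetch`/`store` callables are hypothetical names, and the stand-in functions replace the real HTTP request and MongoDB insert so the example runs on its own.

```python
from concurrent.futures import ThreadPoolExecutor


def crawl(question_ids, fetch, store, workers=8):
    """Fetch every id on a thread pool and store each result.

    fetch(question_id) returns a dict; store(dict) saves it.
    Both are supplied by the caller.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map runs fetch concurrently but yields results in input order.
        for doc in pool.map(fetch, question_ids):
            store(doc)  # runs in the calling thread, one document at a time


# Stand-in for demonstration; a real fetch would perform an HTTP request.
def fake_fetch(question_id):
    return {"_id": question_id, "text": f"question {question_id}"}


saved = []  # stand-in for a MongoDB collection
crawl(["20151112", "20151113"], fake_fetch, saved.append)
print([doc["_id"] for doc in saved])  # ['20151112', '20151113']
```

While one thread waits on the network, the others keep downloading, so the total time is closer to the slowest request than to the sum of all requests. Each result is a plain dict, which is the shape a schema-less store accepts without any table definition.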
Requests and BeautifulSoup are the two main modules here.
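As a hedged sketch of how the two modules might divide the work: Requests downloads the page and BeautifulSoup turns the HTML into a dict ready for MongoDB. The function names, the `base_url` parameter, and the `oq=1` query value are assumptions. The page's real markup is not known here, so the parser keeps only the title and the visible text instead of targeting specific elements.

```python
import requests
from bs4 import BeautifulSoup


def parse_question(html, question_id):
    """Turn one page of HTML into a plain dict."""
    soup = BeautifulSoup(html, "html.parser")
    # Assumption: without knowing the page layout, keep the <title>
    # and all visible text rather than selecting specific tags.
    title = soup.title.get_text(strip=True) if soup.title else ""
    return {
        "_id": question_id,
        "title": title,
        "text": soup.get_text(" ", strip=True),
    }


def fetch_question(base_url, question_id, timeout=10):
    """Download one question page and parse it."""
    response = requests.get(
        base_url,
        params={"questionId": question_id, "oq": 1},
        timeout=timeout,
    )
    response.raise_for_status()  # fail loudly on HTTP errors
    return parse_question(response.text, question_id)


def save_question(collection, doc):
    """Upsert by _id so re-running the crawler does not create duplicates."""
    collection.replace_one({"_id": doc["_id"]}, doc, upsert=True)
```

With PyMongo, `collection` could be something like `MongoClient()["sat"]["questions"]`, where the database and collection names are made up for this example.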
Feel free to fork my code!