PHP North West 2017
At the end of September 2017, one of our Drupal development team, Alejandro, attended the 10th (and, sadly, what turned out to be the last for the foreseeable future) PHP North West Conference. The event combined a full day of active learning and lesson-based sessions on a range of subjects, followed by a 1.5-day conference with 3 tracks of talks and best-practice knowledge for over 500 delegates. Here are a few of Alejandro's tips and highlights.
Big Data, but fast
Users demand all the benefits of querying huge data sets, but increasingly expect performance measured in milliseconds. Imagine having a huge table with data covering 20 years of sports statistics, for example. The database queries needed to render the content could be very slow and leave the user waiting for a few seconds, which is too long… Enter Hadoop. Paired with a SQL query engine that runs on top of it, such as Apache Hive, this framework lets you divide a huge query into smaller ones according to the data range requested, so that the server is not executing a query over a 20-million-record dataset every time a query is made. This way, queries are faster and the end user's wait time is cut by a factor of 4 or 5 (depending on the query, wait time can drop from seconds or minutes to milliseconds).
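To make the idea concrete, here is a minimal sketch in plain Python (not Hadoop code) of splitting a dataset by range so that a query only touches the slices it needs. The record layout, the `year` and `goals` fields, and both function names are assumptions made up for illustration; a real setup would define the partitions in the query engine itself.

```python
from collections import defaultdict


def build_partitions(records):
    """Group records by year so each year can be scanned on its own."""
    partitions = defaultdict(list)
    for record in records:
        partitions[record["year"]].append(record)
    return partitions


def query_goals(partitions, start_year, end_year):
    """Sum goals for a year range, reading only the partitions in that range."""
    scanned = 0
    total = 0
    for year in range(start_year, end_year + 1):
        for record in partitions.get(year, []):
            scanned += 1
            total += record["goals"]
    return total, scanned


# Hypothetical data: 20,000 records spread evenly over 20 seasons (1998-2017).
records = [{"year": 1998 + i % 20, "goals": i % 5} for i in range(20_000)]
partitions = build_partitions(records)

# A query for the last two seasons reads 2 of the 20 partitions.
total, scanned = query_goals(partitions, 2016, 2017)
print(f"goals={total}, records scanned={scanned} of {len(records)}")
```

In this toy version the two-season query reads a tenth of the records instead of the whole table, which is the effect the technique is after.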
To apply this technique, you need a good monitoring tool such as New Relic to report on how users query data from the server; this is necessary to configure the buckets in Hadoop. Hadoop does not do this automatically, because the right buckets depend on the nature of the data and the way users query it, and this varies for each business or website. So Hadoop can certainly help, but if badly configured it can make queries slower than before, hence the need to understand how users query your data before choosing Hadoop. You can learn more about configuration considerations here: http://hadoop.apache.org/docs/current/
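As a rough illustration of why usage reports matter, the sketch below looks at a log of requested year ranges and suggests a bucket width from the most common request size. The log format, the sample values, and the `suggest_bucket_span` helper are all hypothetical; this is one simple heuristic, not something New Relic or Hadoop provides.

```python
from collections import Counter


def suggest_bucket_span(query_log):
    """Return the most frequently requested range width, in years."""
    spans = Counter(end - start + 1 for start, end in query_log)
    span, _count = spans.most_common(1)[0]
    return span


# Hypothetical log of (start_year, end_year) pairs taken from usage reports.
query_log = [(2017, 2017), (2016, 2017), (2017, 2017), (2010, 2017), (2015, 2015)]
suggested = suggest_bucket_span(query_log)
print(f"most requests span {suggested} year(s)")
```

If most requests cover a single season, one bucket per year is a sensible starting point; if users usually ask for a decade at a time, yearly buckets would only add overhead.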