MySQL - How to log and analyze clicks, pageviews and sessions to optimize conversion
We have a medium-sized e-commerce site. We sell books. On said site we have promotions, user recommendations, regular book pages, related books, et cetera. It is quite similar to amazon.com, except of course for the volume of the site.
We have a traditional LAMP setup, where the M still stands for MariaDB.
TPTB want to log and analyze user behaviour in order to optimize conversion.
Bottom line, each click has to be logged, I think. (I fear)
This will add up to a few million clicks every month. The system has to be able to go back in time at least 3 years.
Questions that might be asked of the system are: given a page (e.g. the homepage) and clicks on a promotional banner, which color of said banner gives the best conversion? Split this question for new and returning customers (multi-dimensional or A/B testing). Or: given a view of book A and B, which books do users buy next? The range of queries is going to be wide, so aggregating the data beforehand is pointless.
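To make the banner question concrete, here is a minimal sketch in plain Python of the kind of multi-dimensional query meant above. The event fields (`banner_color`, `customer`, `purchased`) and the sample rows are made up for illustration; they are not our actual schema. The point is that the split dimensions are chosen at query time, which is why the raw rows have to be kept.

```python
from collections import defaultdict

# Hypothetical raw events: one row per banner impression on the homepage.
EVENTS = [
    {"banner_color": "red", "customer": "new", "purchased": True},
    {"banner_color": "red", "customer": "new", "purchased": False},
    {"banner_color": "blue", "customer": "new", "purchased": False},
    {"banner_color": "blue", "customer": "returning", "purchased": True},
    {"banner_color": "red", "customer": "returning", "purchased": False},
    {"banner_color": "blue", "customer": "returning", "purchased": True},
]


def conversion_by(events, *dimensions):
    """Return {dimension values: purchases / impressions} over raw events."""
    shown = defaultdict(int)
    bought = defaultdict(int)
    for event in events:
        # The grouping key is built from whichever dimensions were requested.
        key = tuple(event[d] for d in dimensions)
        shown[key] += 1
        if event["purchased"]:
            bought[key] += 1
    return {key: bought[key] / shown[key] for key in shown}


# Conversion per banner color, then split further by new vs. returning.
BY_COLOR = conversion_by(EVENTS, "banner_color")
BY_COLOR_AND_CUSTOMER = conversion_by(EVENTS, "banner_color", "customer")
```

Adding a third dimension later (say, time of day) would only need another field on the raw event, not a new pre-computed counter.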
I have serious doubts about MySQL's ability to provide a platform for storing, analyzing and querying this data. We could store the rows, feeding them to MySQL via RabbitMQ to avoid delays, but querying and analyzing the data efficiently might not be optimal in MySQL, given 50M rows.
There have been a number of articles about using MongoDB to store analytical data. Those posts seem to increment a counter in a document (pre-aggregating the data), which is not enough for us.
The big question is: is there a database (or other system) that is particularly well suited to store and analyze data like this? Might MySQL still do the trick? Am I correct in my assessment that MongoDB might not be of added value here?
If I understand correctly, you want to have reports on aggregated data done once a day (as opposed to "live")? If that's the case, I suggest you use Hadoop, which allows you to run massive map/reduce jobs that run the aggregations for you and present a report. At this amount of data, a "live" solution will not work.
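For anyone unfamiliar with map/reduce, the shape of such a daily job can be sketched in plain Python. This is not the Hadoop API, just an in-memory illustration of the pattern; the click records, `map_click`, `reduce_counts` and `run_job` are hypothetical names invented for this sketch.

```python
from collections import defaultdict

# Hypothetical raw click log: (ISO timestamp, page) pairs.
CLICKS = [
    ("2013-05-01T09:15:00", "/home"),
    ("2013-05-01T09:16:30", "/book/42"),
    ("2013-05-01T21:02:10", "/home"),
    ("2013-05-02T08:00:00", "/home"),
]


def map_click(click):
    """Map step: emit ((day, page), 1) for every click."""
    timestamp, page = click
    day = timestamp[:10]  # keep only the YYYY-MM-DD part
    yield (day, page), 1


def reduce_counts(key, values):
    """Reduce step: sum all the counts emitted for one key."""
    return key, sum(values)


def run_job(records, mapper, reducer):
    """Group mapper output by key, then reduce each group."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in groups.items())


DAILY_REPORT = run_job(CLICKS, map_click, reduce_counts)
```

On a real cluster the grouping step is what gets distributed across nodes; the mapper and reducer stay about this small.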
If you don't want to mess with the complexity of Hadoop and map/reduce, then perhaps MongoDB might work. It has quite a powerful aggregation framework that can be tasked with many aggregations in a sort-of-live environment. It's not meant for running at every pageview, but it's also not a "let's do it once a day" kinda thing. It depends a little bit on your data aggregation requirements whether the aggregation framework can do it for you, but if it doesn't, MongoDB also supports map/reduce for more complex tasks (at a slower pace). MongoDB would be quite a good fit, as you can have large write performance; if one node doesn't work, you can shard to have higher write performance.
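As a rough idea of what an aggregation-framework query for the banner question could look like, here is a sketch that only builds the pipeline as Python data. The field names (`page`, `banner_color`, `customer`, `purchased`) are assumptions about how the events might be stored, and `build_conversion_pipeline` is a made-up helper; with pymongo the result would be passed to `collection.aggregate(pipeline)`.

```python
def build_conversion_pipeline(page):
    """Build an aggregation pipeline: conversion per banner color and customer type.

    All field names are hypothetical; adapt them to the real event documents.
    """
    return [
        # Keep only events recorded on the requested page.
        {"$match": {"page": page}},
        # Count impressions and purchases per (color, customer type) group.
        {
            "$group": {
                "_id": {"color": "$banner_color", "customer": "$customer"},
                "shown": {"$sum": 1},
                "bought": {"$sum": {"$cond": ["$purchased", 1, 0]}},
            }
        },
        # Turn the two counters into a conversion ratio.
        {"$project": {"conversion": {"$divide": ["$bought", "$shown"]}}},
    ]


PIPELINE = build_conversion_pipeline("/home")
```

Because the grouping key is written at query time, this works on raw events and does not need the counter-per-document approach from the articles mentioned in the question.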