In July 2016, Magento acquired RJMetrics, a business intelligence platform designed primarily for e-commerce analytics. This product has now been branded as Magento Business Intelligence, offering a full stack BI product to Magento merchants.
Having worked as an analyst with RJMetrics and then Magento for a combined 3 years in the city of brotherly love, I’ve worked with over 200 small to medium sized e-commerce customers, advising them on best practices in using the product, building reports, as well as maintaining an optimized data warehouse. I now work independently, consulting with merchants to make the most of their reporting and analytics efforts. It only took me 3 years to realize that there should be content about the most common questions that customers asked about (and sometimes dreaded) when using Magento Business Intelligence. Read on!
What’s a replication method, and which one should we use?
For a Magento database, most tables have an auto-incrementing numeric primary key column called “entity_id”. Hence, using the “Auto-Incrementing Primary Key” replication method is the recommended option for most tables. For some other tables that have a combination of two or more columns as the primary key, like “catalog_category_product”, the “Primary Key Batch Monitoring” option is best. That said, if the “updated_at” column on the database tables is set up correctly - i.e. it is a timestamp that is automatically updated when a row is updated - the “Modified At” replication method is the best option.
Is there a difference in the direction of setting up paths - one-to-many or many-to-one?
Yes. When setting up paths between tables, think about the two tables you are connecting. For example, let’s say you are connecting the “orders” and “line items” tables. Ask yourself, “Does a single order have multiple line items, and does a line item belong to multiple orders?” In most cases, the answer to the first question is “yes”, and the second is “no”. Hence, in the case of these tables, the “orders” table is the “one” side of the relationship while the “line items” table is the “many” side.
Consequently, when bringing a column from the “orders” table to the “line items” table, it can be brought over as a “Joined Column”. All information about an order is also valid for all its associated line items. On the other hand, a column cannot be brought from “line items” to “orders” using a “Joined Column” because an order could have more than one distinct associated line item. So, when bringing data from “line items” to “orders”, an aggregation function - such as sum, average, max, min etc. - needs to be performed.
The concept is illustrated visually below.
In the above example, let’s consider two scenarios -
Which table should metrics be built on?
The short answer is that metrics should be built on the lowest “grain”. “grain” refers to the granularity of data in the table. The “orders” table contains information about each order - its total amount, date, customer who purchased the order. However, it does not contain information about the line items in the order. This information lives in the “line items” table. Hence, “line items” is at a lower granularity than the “orders” table i.e. it contains all information that the “orders” table contains and then some additional information.
Hence, everything that can be built on the “orders” table can also be built on the “line items” table. Further, the “line items” table allows you to segment your data to a greater degree than the “orders” table due to the presence of more granular data in it.
How should many-to-many relationships be analyzed?
Magento BI does not have an inbuilt way to handle many-to-many relationships. To analyze data from tables sharing a many-to-many relationship, the recommended method is to use SQL - either in the SQL Report Builder or in a Data Warehouse View. The two tables will need to be joined, with each joined combination becoming a new row. For example, let’s consider the case of each product being in multiple categories, and each category having multiple products. If there were a total of 4 products, each belonging to 3 different categories, the total number of combinations would be 4X3 = 12 rows. Once this join has been performed, either in the SQL Report Builder or as a Persistent View, information about these combinations can be added to these reports/views.
These are some of the most common questions customers of Magento BI have about how to best use the product. With MBI evolving quickly, I’m sure there will be many more questions that haven’t been covered above. Have other questions about best practices or looking for advice? Leave a comment or reach out here.