top of page
Search

Generative AI: Agile Data Warehouse Design with BEAM

In this blog post and the next, we'll explain what BEAM is and when you should use it. We'll take you through every step. You can also run the steps yourself with the help of Qlerify and Chat GPT. See how in the full video at the end of this blog post!


A robot looking at a data warehouse.


WHAT IS BEAM?

BEAM is a workshop method for modelling data warehouse/business intelligence (DW/BI) solutions. BEAM stands for Business Event Analysis & Modelling and is described in the book Agile Data Warehouse Design (Collaborative Dimensional Modeling from Whiteboard to Star Schema) by Lawrence Corr. It's a book we highly recommend if you are working BI but also if you are a Business Analyst outside DW/BI. We believe BEAM can be very beneficial outside pure DW/BI projects. For example to structure data in complex software applications, making it easier to understand and manage the data.


BEAM is a method that is neither data driven nor report driven but business driven, it:

  • uses collaborative modelling directly with business stakeholders.

  • discovers business events by first asking the business: "Who is doing what?"

  • uses the 7Ws to capture example data that is easily understood by business stake holders.

  • uses dimensional modelling.

A BEAM session is usually recorded on a a whiteboard or a spread sheet. Now it can also be done with Qlerify. We recently launched BEAM powered by Chat GPT! The info boxes ℹ️ below show how every step is done in Qlerify.


Collaborative modelling

In collaborative modelling we involve the future users of the product in workshops from the start. We want them engaged already in the design process. This will increase the chances of success in the project and also reduce the training need of end users.


Business events

Business events are things that happen during daily business operations. Typically triggered by people performing actions. For example a customer that buys a product, uses a service or a supplier that delivers a component. But it can also be triggered by a system (weather stations records temperature) or by time (end of the month). From BI perspective, relevant business events are things that happen that we want to record and be able to measure and evaluate.


7Ws

The 7Ws are often cited as the checklist used by journalist to cover the ‘full’ story. Each W is a questions. The 7W are useful for modelling data warehouses because it’s the same questions that the end users will ask when querying the data.

  • Who is involved? (People, Organisation, Employee, Customer)

  • What did they do? To what? (Things, Product, Service)

  • When did it happen? (Date, Time of Day)

  • Where did it take place? (Location, Store, Hospital, Delivery Address)

  • Why did it happen? (Reason, Causality, Promotion, Weather)

  • How did it happen? In what manner? (Transaction ID, Status code, Order ID, Call Status)

  • How many or how much? (Measures, KPIs, Sales revenue, Quantity)

Dimensional modelling

An effective way to model a DW/BI solution is Dimensional Modelling as opposed to for example Entity-Relationship (ER) modelling.


A dimensional model can be visualised as a star schema. It has a fact table at the center surrounded by multiple dimension tables. The fact table holds the business events, for example Customer Orders. The dimension tables hold the meta data that describe the business event, for example: customer info, product info etc. We use the dimensions to create meaningful reports, to filter, group, and aggregate the information in the fact table.


Below is an example of a fact table at the center containing customer orders. It's surrounded by 4 dimension tables with meta data.


A star schema with a fact table in the middle surrounded by 4 dimension tables.

🔖 Exercise: Are the dimensions covering your reporting needs in your current business (if you take orders of any kind).


BENEFITS

BEAM is a collaborative process first approach for data modelling. It makes it easy and natural to engage with stake holders. It’s easy to understand for everyone involved. It shows which participants that know the business inside out. It's a massive learning experience for those who don't. It can be used at any time, before or after the actual operational system is in place. It works perfect for agile teams because we gather the requirements in small chunks suitable for iterative implementation, one star scheme at the time.


HOW DO TO RUN A BEAM WORKSHOP WITH SUPPORT OF AI

1. Discover an Event: Ask “Who Does What?”

As a BEAM workshop facilitator you start by asking the simple question (the first 2Ws):

Who does what?

The answer should typically be a subject-verb-object combination, for example:

Customer orders product (subject-verb-object)

This is called the main clause of the event. Someone is performing an action on something. We want to discover events that are relevant for the business, if the answers are a bit off you might want to rephrase the question to:

Who does what that we want to report on within the scope of the next iteration?

Stakeholders typically have several events they need to measure. For instance: Customer orders products, Warehouse ships product, Carrier delivers product. Focus on a single business process, and start with a few events.


It might not always be possible to find a subject (who). For example if you have an event like “weather station records temperature”. That is ok as long as you have a verb worth measuring.


We will show in the grey boxes how you can perform each step in Qlerify with the help of AI.

ℹ️ Step 1 in Qlerify: Who does what?

1. Create a project

2. Create a workflow

3. Click "Generate workflow with AI"

4. Type for example "process for a customer order including the roles customer, warehouse and delivery"

5. Press "Generate workflow" (Note that you can get different answers every time with AI)




2. Create the first BEAM Table

Pick an event together with the stakeholders to get started. For example "customer buys product". Use a spreadsheet, whiteboard or Qlerify and draw the first BEAM Table.


In the video below you can see several important BEAM notation conventions: The subject (CUSTOMER) and object (PRODUCT) are column headers. The verb (or connecting word) is placed in its own row above the object. These words make event stories readable. The column headers are the event details that will eventually become facts or dimensions. The table is now ready to record details and example data.

​​ℹ️ Step 2 in Qlerify: Create BEAM Table

1. Use AI to generate a suggestion for an Event Story





3. When?

Every event story has at least one defining point in time. No meaningful BI analysis takes place without a time element. Ask the first when questions by repeating the main clause of the event and add when.


When do CUSTOMERS order PRODUCTS?

or

CUSTOMERS orders PRODUCTS when?

To which stakeholders might respond:


On order date

You are looking for the field name. Sometimes only the date matters, sometimes we need the exact time in fractions of a second. Now that you have the subject, object and the initial when detail you can begin filling out the table with event stories.


4. Start capturing example data

Now it’s time to start filling out the table with example data. If your stakeholders quickly can provide it then you are modelling with the right people. Stakeholders who know their own data.


Example data clarifies the meaning of each event detail with minimum documentation. It helps stakeholders visualise how their data will appear on reports. Ask for five or six rows of example data according to the following structure:

  • Typical – Most typical values for each detail (most frequent customer, most frequent product etc.)

  • Different – Same as typical but different values in each field (if possible)

  • Repeat – One row that is as similar as possible to the typical one, in order to see what makes each row unique.

  • Missing – One row with as many “null”, “N/A”, “Unknown” values as possible, to discover mandatory fields.

  • Group – Can some values contain more than one value? More than one customer? A bundle of products?

Ask for additional When details and add them: “Are there any other dates and times associated with a customer ordering products”? The answer might be:

Yes, orders are due for delivery on a delivery due date.

5. Determining story type

After you have identified the when details and documented them with example data you can determine the story type, which in turn will give you clues about subsequent detail types you can expect.

  • Recurring Events – If the event contains a when detail with an "every" preposition it indicates a recurring event. It will often contain balance measure when you later ask how many.

  • Evolving Event – If you have two or more when details you may have an evolving event. For example “Orders are delivered on delivery date and paid on payment date.” In this case, all the dates are not set initially but added as the event evolves over time. In this case you should look for example data that illustrates the event’s initial and final states. Order and Payment could be modeled as two discrete events instead of one evolving event, it depends on how the stockholders view them.

  • Discrete event – By elimination, if an event is neither recurring nor evolving, it must be discrete. You can reconfirm this each time you discover a new event detail by asking if it’s example value can ever change. If the value never changes the event is discrete.

6. Who?

Now it’s time to find out whether there are other whos associated with the event (except for the subject in the main clause). Using the current main clause you might ask:

CUSTOMER orders PRODUCT from whom?

To which the stakeholders might reply:

Salesperson

If so, add the new who to the table and ask for example salespeople to match the existing event stories. (Typical, Different, Repeat etc.) On the Group example row you might ask:

Is there always just one SALESPERSON responsible for the order?

As you can see in the example below we discovered that sales person can be an individual or a team or sometimes missing.


7. What?

Next ask for any additional whats associated with the event. The general form of the question is: “Subject Verb Object with/for what?”. For example:

CUSTOMER pays MAINTENANCE FEE for what?

this might give you the what detail: SOFTWARE PRODUCT that would be added to the table with a for preposition. You can keep repeating variations of the of the what question to see if there are any more what details but be careful to not collect "detail about detail". As an example, product information (like product category, version etc.) normally belongs on the product dimension and not on the Event Story.


8. Where?

The next detail type to look for is where. You ask for this by using the event’s main clause with a where appended:

CUSTOMER orders PRODUCT where?

You are trying to figure out if wether the event occurs at a specific geographic location (or website address). If the stakeholders respons:

Online, or at a retail outlet.

Then extend the table to record the website URL or retail store location as a where detail of the event. In this example you can see three different types of location: store, website and call center.


9. How many?

How many questions are used to discover quantities associated with an event that will become facts in the physical data warehouse. Again you repeat the main clause of the event as a question to the stakeholders, but this time with the “how many” and its variants: “how much” , “how long” etc.

CUSTOMER orders how many PRODUCTS?
How much are PRODUCT orders worth?
How else would you measure this event?

In all cases you are looking for the name of the quantity field, not the value. The typical answer would be Order Quantity, Amount, Sum or Revenue.


10. Why?

Capturing why details is the next step for our Event Story. As with the other “W” questions you ask a why question using the main clause of the event:

Why do CUSTOMERS order PRODUCTs in thes quantities on these dates at these locations?

The answer might be for example:

PRODUCT PROMOTIONS

You would expand the event table as shown in the figure below and add example stories that illustrate typical and exceptional circumstances.


11. How?

The final “W” questions discover any how details. How refers to the actual mechanisms of the business event itself. You discover these details by asking how questions using the main clause of the event.

How does a CUSTOMER order a PRODUCT?
How do you know that a CUSTOMER ordered a PRODUCT? What evidence do you have?

Often how details include transition identifiers from the operational system(s) that capture each event. If the stakeholders respond with:


A customer or salesperson creates an ORDER with and OREDER ID.

Then you would add ORDER ID to the table as in the picture below.


You should ask further questions to find out if there are any more descriptive how details. You are typically looking for methods and status descriptions. A suitably rephrased how question might be:

In what ways can a CUSTOMER order a PRODUCT?

To which the stakeholder might respond:

Using a credit card or a purchase order. We’ll call that PAYMENT METHOD.

12. Finalise the Event Story

Complete the Event Story by double checking the Event Granularity. Each story (row in the table) must be uniquely identifiable. Discovering the Event Granularity is the job of the repeat story line. If every detail in the repeat story matches the typical story you don’t have enough details to define the granularity. Then usually you can use a transaction identifier such as Order ID to differentiate the rows.

​​​ℹ️ How to update the rows of example data in Qlerify




By scanning the completed table the stakeholders can now read their finished event stories such as:


Customer orders product, on order date, for delivery on delivery due date, to delivery address, for delivery to delivery address, in quantity, with comment, using order ID.


Next

You have completed the first event. Next up, model the dimensions to generate a star scheme before moving on to other events. Once you have a library of dimensions, you can proceed rapidly from event to event. Just be careful to not model too many events at one time. Just model enough so that it easily fits in to the next iterations.


Modelling Business Dimensions

In the second part of this tutorial we will describe how to complete the model with a fact table, dimensions and a star scheme. Soon to be published. Here is a videos showing a quick way to run through all the steps using Qlerify and our Chat GPT integration.


Demo showing all the steps:


573 views
bottom of page