The conference starts at 9:30 am and ends at 5:15 pm on both conference days. Registration commences at 8:30 am.
Tuesday, March 28, 2017
9:30 | Opening by the conference chairman Rick van der Lans
Session 1 Room 2 | Fast Data: The Next Frontier of Big Data | Rick van der Lans
Case Room 2 | Data driven management: How data scientists employ an advanced data platform and self-service analytics in the DJ & Entertainment industry | Edwin Witvoet and Mischa van Werkhoven
Session 2A | Organizing the Data Lake: How to Extend Data Management beyond the Data Warehouse | Mark Madsen
Session 2B | The Road to the Amsterdam Smart City Infrastructure | Rutger Rienks
Session 3A | Rocking Analytics in a Data Flooded World | Bart Baesens
Session 3B | Agility through Data Virtualisation: from Data Vault to SuperNova to a Logical Data Warehouse | Jos Kuiper
Case Room 2 | Change your current data warehouse to an agile data warehouse by adding more complexity | Jos Driessen
Session 4 | Agile Project Management for Data Warehouse and Business Intelligence Projects | William McKnight
17:15 | Reception
Wednesday, March 29, 2017
9:30 | Opening by the conference chairman Rick van der Lans
Session 5 | Logical Data Lake and Logical Data Warehouse: two sides of the same coin | Rick van der Lans
Case Room 2 | Continuous Integration in Business Intelligence. Innovation driven, culturally inspired. | Martin Pardede and Lukas Ames
Session 6A | Strategies for Consolidating Enterprise Data Warehouses and Data Marts into a Single Platform | William McKnight
Session 6B | The renewed BI-landscape of the Erasmus MC using the Scrum Framework and Data Virtualization | Kishan Shri
Session 7A | IoD: Internet of Data | Pieter den Hamer
Session 7B | Implementing the Enterprise Data Delivery Platform using Data Vault Modelling | Erik Fransen
Case Room 2 | Instrument your business with an enterprise-ready data lake | Rixt Altenburg and Rens Weijers
Session 8 | Beer, Diapers and Correlation: a Tale of Ambiguity | Mark Madsen
Daily schedule:
16:00 – 17:15 Session 4
On the 28th of March, there will be a reception after the final session.
1. Fast Data: The Next Frontier of Big Data (Dutch spoken)
Rick van der Lans, Managing Director, R20/Consultancy
In the first generation of big data systems, the focus was primarily on storing and analyzing very large amounts of data. The focus was entirely on volume. Currently, organizations have entered the second phase of big data: fast data. Fast data is about streaming and instant analysis of large amounts of data. This is the world of the Internet of Things (IoT), where interconnected devices communicate with each other over the Internet, but also of machine-generated sensor data and weblogs. Everything revolves around speed. Fast data is clearly the next frontier of big data systems. And most organizations will have to deal with this now or in the future, from the most traditional financial institutions to manufacturers and online gaming companies.
In this massive flow of data, valuable business insights are often deeply, very deeply hidden. The business value of fast data lies in the analysis of all this streaming data. Unfortunately, analyzing fast data is different from analyzing enterprise data stored in data warehouses with data visualization tools. For example, fast data can be very cryptic in nature, so it often has to be combined with enterprise data, which in turn is stored in the data warehouse. And to be able to do something useful with it, this data should be analyzed in real time, because an immediate response is expected. Sometimes the data has to be analyzed even before it is stored. The world of fast data is a new world. This session discusses the architectural aspects of fast data, provides guidelines for adopting fast data and explains how fast data can be integrated within the existing BI environment.
- What is the relationship between fast data and the classic world of business intelligence and data warehousing?
- A new architecture is necessary for processing fast data
- Technologies and products are needed for analyzing fast data
- How should fast data be integrated in the enterprise data warehouse?
- The challenge of responding in real time to incoming fast data
- What is the relationship between big data and the IoT?
2A. Organizing the Data Lake: How to Extend Data Management beyond the Data Warehouse
Mark Madsen, President and founder of Third Nature
Building a data lake involves more than installing and using Hadoop. The focus in the market has been on all the different technology components, ignoring the more important part: the data architecture that the code implements, and that lies at the core of the system. In the same way that a data warehouse has a data architecture, the data lake has a data architecture. If one expects any longevity from the platform, it should be a designed rather than accidental architecture.
What are the design principles that lead to good functional design and a workable data architecture? What are the assumptions that limit old approaches? How can one integrate with or migrate from the older environments? How does this affect an organization’s data management? Answering these questions is key to building long-term infrastructure.
This talk will discuss hidden design assumptions, review some design principles to apply when building multi-use data infrastructure, and provide a conceptual architecture. Our goal in most organizations is to build a multi-use data infrastructure that is not subject to past constraints. This conceptual architecture has been used across different organizations to work toward a unified data management and analytics infrastructure.
You Will Learn:
- Data architecture alternatives that are able to adapt to today’s data realities
- New ways of looking at technology that can be applied to address new problems inherent in today’s uses and scale of data
- Methods and techniques to migrate from older data architecture to new ones that resolve today’s problems and prepare for the future
2B. The Road to the Amsterdam Smart City Infrastructure (Dutch spoken)
Rutger Rienks, Program Manager Datapunt, Gemeente Amsterdam.
This session will be about one of the most modern data infrastructures in the world. The Amsterdam smart city infrastructure is being realized using open source components and a scrum/agile way of working.
The hurdles that must be overcome in a huge governmental organization, as well as the personal and technical challenges, will be covered. Also, based on actual smart city cases, the potential of information-led decision making and decision support in the smart city will be discussed.
- Why should a local government transform into an intelligence-driven organisation?
- How to influence an organization to kick-start the movement?
- Some examples of increased well-being of citizens and improved governmental activity.
- How to mobilize city workers to abide by advice presented by theoretical models
- Insights into the ambitions of the smart city data infrastructure
3A. Rocking Analytics in a Data Flooded World (Dutch spoken)
Bart Baesens, professor at the KU Leuven and lecturer at the University of Southampton (UK)
Companies are being flooded with tsunamis of data collected in a multichannel business environment, leaving an untapped potential for analytics to better understand, manage and strategically exploit the complex dynamics of customer behavior. In this presentation, we will start by providing a bird’s eye overview of the analytics process model and then illustrate how to fully unleash its power in some example settings. We will review data as the key ingredient of any analytical model and discuss how to measure its quality. We will zoom into the key requirements of good analytical models (e.g. statistical validity, interpretability, operational efficiency, regulatory compliance etc.) and discuss emerging applications. Throughout the presentation, the speaker will extensively report upon his research and industry experience in the field. Attendees will learn:
- The impact of data quality on analytical model development
- The key requirements for building successful analytical applications
- The trade-off between accurate and interpretable analytical models
- Emerging analytics applications and accompanying challenges
- State-of-the-art research and industry insights on Big Data & Analytics.
3B. Agility through Data Virtualization: from Data Vault to SuperNova to a Logical Data Warehouse (Dutch spoken)
Jos Kuiper, IT Enterprise Architect, Volkswagen Pon Financial Services
How can we improve agility in preparing data for end-users and for information products, like reports, dashboards etc.? For this purpose a proof of concept with a data virtualization solution was performed.
Given a number of challenges in a traditional BI architecture, Jos will dive into the merits of a data virtualization solution. In a proof of concept, a data virtualization solution was tested on top of a Data Vault data warehouse. The capability of the data virtualization solution to combine historic data, stored in the Data Vault, with live data stored in back-office systems was also investigated. This presentation will also give a brief introduction to the data modelling methods Data Vault and SuperNova. This session will provide insights on:
- What prompted the use of data virtualization?
- How the proof of concept was conducted, with a brief introduction to the data modelling methods Data Vault and SuperNova
- The benefits of a data virtualization solution
- The fit of data virtualization in a BI/data warehouse architecture
- And an unexpected benefit …
4. Agile Project Management for Data Warehouse and Business Intelligence Projects
William McKnight, President McKnight Consulting Group
- When Does Agile Apply to DW&BI Projects
- Getting Set Up for Agile
- Agile Terms to Use and Do
- DW&BI Agile Roles
- Implications & Challenges of Moving to Agile
- Organizational Change Management and Agile
5. Logical Data Lake and Logical Data Warehouse: two sides of the same coin (Dutch spoken)
Rick van der Lans, Managing Director, R20/Consultancy
Of course, it can be very useful to have one environment in which all of the data can be found in its original (raw) form. A data lake is certainly very useful for data science and investigative analytics. But the main question is whether it really needs to be a physical data store, as suggested by the experts. Is it not sufficient that users can access a system that gives them access to all the data in its original form? Or, why not a logical (or virtual) data lake? The technology, such as data virtualization servers, is available and mature enough to develop logical data lakes. It would greatly reduce the copying of massive amounts of big data from its source to the data lake.
But what is the difference between a logical data lake and a logical data warehouse? Don’t they work in the same way? Aren’t they actually the same? Both present a heterogeneous set of data sources as one large logical database. This session discusses how the two concepts, logical data lake and logical data warehouse, can be integrated, and how they can still support the typical data lake and data warehouse workloads. We are talking about two sides of the same coin here. One integrated architecture is shown that supports both modern concepts.
- What are the restrictions of a physical data lake and what are the advantages of a logical data lake?
- The differences between a data lake and a data warehouse: schema-on-read, highly agile, unstructured and semi-structured data, low-cost storage
- What are the practical advantages of the logical data warehouse architecture and what are the differences compared to the classic data warehouse architecture?
- Guidelines for setting up one integrated architecture for a logical data lake and a logical data warehouse
- Several real-life experiences with implementing logical data lakes and logical data warehouses.
6A. Strategies for Consolidating Enterprise Data Warehouses and Data Marts into a Single Platform
William McKnight, President McKnight Consulting Group
- Inefficient Information Architecture
- Methods of Data Mart Consolidation
- The continued relevance of databases and data warehouses
- Many data warehouses, 1 data warehouse
- Columnar orientation to databases
- Keys to Data Mart Consolidation Success.
6B. The renewed BI-landscape of the Erasmus MC using the Scrum Framework and Data Virtualization (Dutch spoken)
Kishan Shri, Advisor Business Intelligence and Scrum Master at Erasmus Medical Center (MC)
Kishan Shri, Business Intelligence Consultant and Scrum Master at the Erasmus MC, will explain the hospital’s approach and the strategic choices that were made along the way as well as lessons learned. These include:
- A vision on a data-driven hospital
- The renewed BI-architecture of the Erasmus MC, including the positioning of a data virtualization platform
- Lessons learned regarding:
- Vendor selection for data virtualization
- Implementing data virtualization
- Adopting and implementing the Scrum Framework in relation to data virtualization.
7A. IoD: Internet of Data (Dutch spoken)
Pieter den Hamer, Lead Big Data, Business Intelligence & Analytics, Alliander
The Internet of Data – as a pragmatic reincarnation of the semantic web – promises to behave better. Concepts and techniques such as linked (open) data, ontologies, OWL, RDF and SPARQL can help to link data in an ‘agile’ way, focused on domain-specific applications rather than on a global scale.
Nevertheless, the Tower of Babel continues to interfere – fortunately, AI and (deep) machine learning seem increasingly capable of overcoming differences in semantics and language. We can observe the Internet of Data in real-world initiatives like smart city and smart society. And who knows, might the ‘Intranet of Data’ bring about the end of the trusted enterprise data warehouse?
- How the ‘Internet of Things’ leads to the ‘Internet of Data’: the growing need for agile data sharing and integration
- Beyond the semantic web: from idealism to pragmatism
- State-of-the-art technology & tools
- The problem of semantics: why IoD may fail (as well) and how AI may come to the rescue
- From enterprise data warehouses to enterprise linked data networks: the Intranet of Data
- IoD in the public sector and smart cities.
7B. Implementing the Enterprise Data Delivery Platform using Data Vault Modelling (Dutch spoken)
Erik Fransen, Management consultant at Centennium
Organizations continue to struggle with the challenges they face in delivering Data and Analytics solutions to their customers. Many still have the classic reporting factory in place, originating from the data warehouse and business intelligence architectures of the 1990s. And let’s face it: although it did deliver value in creating the standard reports, delivering data in a fast, integrated and bespoke way for interactive analysis was never its true intention. This is where these data architectures now fail in the new era of Data & Analytics, where instant analysis, access to any data, data integration, and fast and easy delivery are crucial in satisfying user demands. Users demand real-time delivery of both enterprise data and data from other sources, fast implementations, big data access, BI and ETL self-service, impact analysis and lineage insights. These next-generation Enterprise Data Delivery Platforms (EDDP) should make use of modern ensemble modelling methods like Data Vault to become more agile, flexible and transparent in adapting to any data source in a uniform, consistent way, while using data virtualization technology for fast delivery to the user.
- Data & Analytics challenges: more, faster, better, cheaper and easier
- Trimodal Data & Analytics streams: innovative predictability versus predictable innovation
- Short history of Data Vault: from modelling the classic EDW to modelling the next generation EDDP
- Data Vault and the Enterprise Data Delivery Platform: implementing the logical data warehouse
- Data Vault use cases
8. Beer, Diapers and Correlation: A Tale of Ambiguity
Mark Madsen, President and founder of Third Nature
The story of the correlation of beer and diaper sales is a common one, still used to discuss the value of analytics in retailing and marketing. Rarely does anyone ask about the origin of this story. Is it true? Why is it true? What does “true” actually mean?
The latter question is the most interesting because it challenges beliefs about the usefulness and accuracy of analytic models. Many people believe that data is absolute rather than relative, and that all analytic models produce an answer rather than a range of answers.
This is the history of the beer and diapers story, explaining its origins and truth (or falsehood), based on repeated analysis of retail data over the intervening decades. It will explain how one can have multiple, contradictory results and how they can all be simultaneously true. This brings up the real question: how does one apply analytics in business when the data does not give you an unambiguous answer?
- Lessons learned with applying analytics and data science
- Is the notorious beer diaper example really true?
- Data science is one half of the solution, interpreting the results is the other half
- There is no single version of the truth
- It’s not the insight, but what you do with it, that matters.
Cases
Data driven management: How data scientists employ an advanced data platform and self-service analytics in the DJ & Entertainment industry
Edwin Witvoet, Chief Executive Jibe Company
Mischa van Werkhoven, Principal Solution Architect
Festivals, DJs and artists make use of a range of networks to ensure growth of their ‘media company’. This enables them to keep building on fan relations, market share, sponsors, sales etc. However, with that much data, understanding how these networks contribute to the company goals is becoming more and more complex. Find out how Jibe uses Qlik to offer Data Management & Intelligence.
Change your current data warehouse to an agile data warehouse by adding more complexity
Jos Driessen, founder BI Demystified
But how can adding more complexity make you more agile? Experts can tell you how to build an agile data warehouse. Use a data lake and concepts like Logical Data Warehouse, Data Vault and SuperNova.
The answer is data warehouse automation: In this session, learn how programming robots can generate a data warehouse for you. You retain control. You decide what to build. But you don’t have to dive into the details. Robots will do that for you.
This will change your projects into initiatives and enable business analytics.
Continuous Integration in Business Intelligence. Innovation driven, culturally inspired.
Martin Pardede, BI Delivery Manager, Bestseller
Lukas Ames, IT consultant, Cimt
Bestseller is experiencing strong growth in its e-commerce department. The division is going through a transition from being, in essence, a successful start-up to becoming a future leader in fashion e-commerce.
To build a BI system that can cope with this rapid expansion, not only the technical aspects were considered but also a process that ensures the quality of the system. This session will offer you a view of how Bestseller combines company culture with best practices from the developer community. We will talk about how we created a way of working with Scrum, Jira, Talend, Bitbucket and TeamCity.
Instrument your business with an enterprise-ready data lake
Rens Weijers, Director Data & Performance Management, Nuon Vattenfall
Rixt Altenburg, Manager Customer Insights, Nuon Vattenfall
It is often clear to analysts and other data professionals: the traditional BI environment is insufficiently capable of making use of all available data. A data lake is required to process large amounts of data, store new data formats and get the necessary tools to combine and analyze the data swiftly and efficiently. Making this step is required to generate new customer insights or to improve internal processes. The question is how to implement a data lake swiftly. In a corporate environment with many different stakeholders, this is not an easy step. Furthermore, because the subject is unknown to the organization, there will be a lot of uncertainty about how to get it fully operational.
Subjects that will be discussed are:
- The cooperation with IT
- How to get senior management on board
- Implementing a data lake and the operational challenges
- Lessons learned from the Nuon case