Reliable data persistence is one of the most crucial and sensitive engineering problems in the IT sector. For a long time, the Relational Database Management System (RDBMS) has been a well-established and mature technology, in which data are stored in tables. Although the so-called entity-relationship model is human-readable, good design practice requires data normalization in order to avoid duplication and inconsistencies and eventually to arrive at a reliable database schema. As a side effect of normalization, many table fields end up referencing auto-generated foreign keys, and the data become harder for a human to read.
More importantly, normalization leads to more complicated join queries, which hurt maintainability. Despite this undesired complexity in maintaining and querying data, relational databases were for a long time the default choice because they came with a key feature: ACID support. In short, the ACID properties (atomicity, consistency, isolation, durability) guarantee that transactions are processed reliably; in particular, once data has been committed, it will remain accessible to future queries. The need for ACID motivated engineers to investigate ways to overcome the problems above. A well-accepted way is to add indexes, which make lookups faster, but querying still becomes more expensive as the data size grows. The negative effect of data size becomes particularly important with the rise of big data, where query performance degradation becomes a real bottleneck in applications that require low processing times.
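The points above about normalization, joins, indexes and commits can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table names, column names and sample rows (customers, orders, idx_orders_customer) are hypothetical and chosen only for illustration; they do not come from the course material.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two normalized tables (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        -- orders stores only the customer's key, not the name (normalization)
        CREATE TABLE orders (
            id          INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id),
            item        TEXT NOT NULL
        );
        -- an index on the foreign key speeds up lookups and joins
        CREATE INDEX idx_orders_customer ON orders(customer_id);
        """
    )
    # Using the connection as a context manager commits on success
    # and rolls back if an exception is raised inside the block.
    with conn:
        conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Alice')")
        conn.execute("INSERT INTO customers (id, name) VALUES (2, 'Bob')")
        conn.executemany(
            "INSERT INTO orders (customer_id, item) VALUES (?, ?)",
            [(1, "keyboard"), (1, "monitor"), (2, "mouse")],
        )
    return conn


def orders_with_names(conn):
    """Resolve the foreign key back to a readable name with a join."""
    return conn.execute(
        """
        SELECT c.name, o.item
        FROM orders AS o
        JOIN customers AS c ON c.id = o.customer_id
        ORDER BY o.id
        """
    ).fetchall()


if __name__ == "__main__":
    connection = build_demo_db()
    for name, item in orders_with_names(connection):
        print(name, item)
```

Even in this tiny schema, reading an order in human-friendly form already requires a join; with more tables the queries grow accordingly, which is the maintainability cost described above.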
This is one of the main reasons the NoSQL approach was born. The NoSQL revolution is about making it possible (and easy) to query more and more data. To do so, some tradeoffs had to be made in order to make databases faster. Strict ACID guarantees were the first requirement to be relaxed: with millions and millions of transactions it may be acceptable (depending, however, on the use case) to lose a couple here and there. Another bet was to make it easier to interact with large volumes of data. The “spontaneous” solution to this problem was to popularize different interfaces, the simplest of which is the key-value interface. With this approach, one simply stores a value under a key and can later use that key to access the value. Although this simple method fits some use cases well, the value is opaque and, as such, meaningless to the database. If opaque values are not desired for the use case at hand, a document-based database could be more appropriate. In short, this model is suited to managing semi-structured data and still uses a key to look up a document. Moreover, one can also look up a document based on its contents, which are indexed for faster retrieval.
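The difference between the two interfaces described above can be sketched in a few lines of Python. This is a toy in-memory illustration under stated assumptions, not the API of any real NoSQL product: the class names KeyValueStore and DocumentStore, their methods, and the sample data are all hypothetical, and indexed field values are assumed to be hashable.

```python
class KeyValueStore:
    """Toy key-value store: the value is an opaque blob to the store."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class DocumentStore:
    """Toy document store: documents are dicts, and chosen fields are indexed."""

    def __init__(self, indexed_fields):
        self._docs = {}
        # One secondary index per field: field value -> set of document keys
        self._indexes = {field: {} for field in indexed_fields}

    def put(self, key, doc):
        self.delete(key)  # drop stale index entries when overwriting
        self._docs[key] = doc
        for field, index in self._indexes.items():
            if field in doc:
                index.setdefault(doc[field], set()).add(key)

    def delete(self, key):
        old = self._docs.pop(key, None)
        if old is None:
            return
        for field, index in self._indexes.items():
            if field in old:
                keys = index.get(old[field])
                if keys is not None:
                    keys.discard(key)
                    if not keys:
                        del index[old[field]]

    def get(self, key):
        """Look up a document by its key, as in a key-value store."""
        return self._docs.get(key)

    def find(self, field, value):
        """Look up documents by content, using the secondary index."""
        keys = self._indexes[field].get(value, set())
        return [self._docs[k] for k in sorted(keys)]


if __name__ == "__main__":
    kv = KeyValueStore()
    kv.put("user:1", b'{"name": "Alice", "city": "Athens"}')
    print(kv.get("user:1"))  # the store cannot search inside these bytes

    docs = DocumentStore(indexed_fields=["city"])
    docs.put("user:1", {"name": "Alice", "city": "Athens"})
    docs.put("user:2", {"name": "Bob", "city": "Athens"})
    print(docs.find("city", "Athens"))
```

The key-value store can only answer "what is stored under this key?", whereas the document store understands the structure of its values well enough to answer "which documents have this field value?".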
Based on the above, it is clear that there are plenty of database technologies out there that pursue a common objective: to minimize the query burden without compromising on query processing time. In other words, these technologies focus mostly on data processing mechanisms and neglect another equally important aspect, namely the relationships between data. Indeed, although relationships are usually treated as a kind of metadata, they can hold valuable information that, for some use cases, is more important than the data themselves.
By the end of this training, a participant is expected to be able to:
- Explain the main differences between RDBMS and NoSQL technologies.
- Explain the pros and cons of each NoSQL model, from both the deployment-complexity and the API point of view.
- Work confidently with shell commands and software implementations that use NoSQL databases.
- Make sound design decisions in real-world problems that involve NoSQL.
This specific Code.Learn program lasts 3 days (Thursday, Friday & Saturday), with 16 hours of lectures and hands-on exercises on a real-life project.
Key Objectives – Curriculum (High level)
This program presents and explores the following topics, covering them through extended real-life business case studies and industry scenarios:
- RDBMS vs. NoSQL Alternatives
- Document space model
- Key-value model
- Architecture Overview, Key Features, Data model, API, Demo
The lessons can be carried out:
- Inside a physical classroom with an instructor,
- In an online environment as a virtual classroom, with a live connection to the instructor through video conferencing, or
- As a combination of both physical and online.
The method of teaching will depend on the current conditions, and also on the participants’ preferences.
In the online format, the instructor presents the taught material through screen sharing, live broadcast, or by working in the cloud, where attendees can see and interact with everything in real time. Attendees can actively participate and ask questions, just as they would in a physical classroom. Additionally, they can collaborate on team projects and deliver assignments and hands-on projects, which the instructor can review and give feedback on without delay.
Data engineers, database engineers, DBAs, DW engineers, computer scientists, software engineers and developers are welcome to participate in this Code.Learn program and to unlock the full potential of the topics taught, upskilling for their future careers.