
Data Migration from Relational Database to MongoDB

Academic Paper, 2019, 8 pages

Computer Science - Software


Abstract — MongoDB is a document-oriented database which helps us group data more logically. This paper demonstrates the conversion of data from a native tabular form to unstructured documents. The documents and the collections within the database need not be well defined prior to the creation of unstructured data in MongoDB. MongoDB has many extensive built-in features and is highly compatible with other software systems, offering flexible ways of accessing data beyond JSON queries; its Business Intelligence Connector makes it compatible with existing databases. Its high scalability is making it remarkable and popular worldwide, which motivated me to write this paper demonstrating the data conversion. This conversion has helped me make most modern data compatible with MongoDB. Data is stored in the cloud, as cloud-based storage is an excellent and cost-effective solution. My solution is highly scalable, as the built-in sharding mechanism for data handling makes MongoDB one of the best big data handling tools. The data that I have used is location-based and stored in MongoDB, which directly yields document-level ACID transactions to maintain data integrity.

Keywords — Data Migration, Relational Database, MongoDB, XAMPP, NoSQL, ACID Transactions

I. INTRODUCTION

A database is usually defined as a collection of data, and the system that handles the data, transactions, problems and issues of the database is known as a Database Management System (DBMS). The database was incepted in the 1960s in order to satisfy the need to store and find data. It began with navigational databases based on linked lists, moved on to relational databases with joins, then to object-oriented databases, and afterwards to join-less NoSQL systems (MongoDB, Cassandra, HBase/Hadoop, CouchDB, Hypertable, etc.), which emerged and became a popular trend [1] in the late 2000s. On one hand, there is the relational database management system (RDBMS), which offers stronger consistency as well as powerful query capabilities and a large body of knowledge and expertise accumulated over the years [2]. On the other hand, there is the NoSQL approach, which offers higher scalability, i.e. it can run faster and support bigger loads.

NoSQL systems in general do not support complicated queries and do not enforce a structured schema. They recommend de-normalization and are designed to be distributed (cloud scale). Because of the distributed model, any server can answer any query. Servers communicate amongst themselves at their own pace, so the server that answers our query might not have the latest data.
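This trade-off between freshness and load distribution can be illustrated with a minimal sketch, assuming a pymongo client and a hypothetical replica set named rs0; reads may be served by a secondary that has not yet applied the latest writes:

from pymongo import MongoClient, ReadPreference

# Hypothetical connection string: a three-member replica set named "rs0".
client = MongoClient(
    "mongodb://host1:27017,host2:27017,host3:27017/?replicaSet=rs0"
)

# Allow secondaries to serve reads; they may return slightly stale data.
db = client.get_database(
    "migrationdb", read_preference=ReadPreference.SECONDARY_PREFERRED
)

# Writes always go to the primary; this read may be answered by a secondary.
print(db.restaurants.find_one({"borough": "Manhattan"}))

Applications that need the very latest data can instead read from the primary, at the cost of concentrating the read load there.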

NoSQL characteristics:

1. Can handle large data volumes.
2. Scalable replication and distribution – NoSQL systems automatically spread data across multiple servers without requiring application assistance. Servers can be added or removed from the data layer without application downtime.
3. Schema-less: Data can be inserted into a NoSQL database without first defining a rigid database schema, and the format of the data being inserted can be changed at any time without application disruption. This provides immense application flexibility, which in turn delivers substantial business flexibility (see the sketch after this list).
4. Open-source development.
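The schema-less property can be sketched minimally with pymongo; the database and collection names below are hypothetical:

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/")  # assumed local server
products = client["migrationdb"]["products"]        # hypothetical names

# Two documents with different fields coexist in the same collection;
# no table definition or ALTER TABLE step is required beforehand.
products.insert_one({"name": "Keyboard", "price": 25.0})
products.insert_one({"name": "Monitor", "price": 140.0,
                     "resolution": {"width": 1920, "height": 1080}})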

This research paper is based on data conversion using MongoDB. I have demonstrated data conversion from a traditional relational database to the MongoDB document-based database. Section II covers the related work done in this field. In Section III, the performance of the proposed concept is verified with the aid of XAMPP and the MongoDB management system. The results follow, and Section IV presents the conclusion.

II. RELATED WORK

This research work is based on MongoDB, in which relationships within applications are expressed by two main tools: references and embedded documents. References store the relationships between data by including links or references from one document to another; applications resolve these references to access the related data. Having the URL or link within a field adds to its range of data access. Embedded documents capture the relationships between data by storing related data in a single document structure; MongoDB makes it possible to embed document structures in a field or array within a document.
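Both modelling tools can be sketched with small, hypothetical documents; the collection layout and field names below are assumptions, not taken from the paper's data set:

from bson import ObjectId

# Referenced design: the order points to a customer document stored elsewhere.
customer = {"_id": ObjectId(), "name": "A. Kumar", "city": "Patna"}
order_referenced = {
    "customer_id": customer["_id"],   # reference resolved by the application
    "items": ["notebook", "pen"],
    "total": 12.5,
}

# Embedded design: the related customer data lives inside the order itself.
order_embedded = {
    "customer": {"name": "A. Kumar", "city": "Patna"},
    "items": ["notebook", "pen"],
    "total": 12.5,
}

The referenced design avoids duplicating customer data across orders, while the embedded design lets an application read a complete order in a single query.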

The name MongoDB is derived from the word 'humongous', as the database is characterized by exceptional scale-up capability. Big data refers to huge, exponentially changing and growing data sets [3]. Traditional frameworks were not able to handle and analyze such large amounts of data, and the Internet of Things (IoT) is creating further exponential growth in data. Big data and linked data are two faces of the same coin, both representing integral and concentric parts of the web-based world. Each data element is connected via URLs and is identifiable, locatable and accessible. The stronger the bond between the two, the higher the sustainability and utility of big data. Semantically well-structured, interconnected and linked data adds intelligence to the data. The main goal is to make data more interactive, participatory and innovative, making sense of the data and extracting information from it for organizational benefit. Semantic technologies can thus help us extract the main value from the data. Text analytics aids in the conversion of unstructured text into structured, meaningful text by deriving and extracting patterns.

Database scaling is a highly difficult task, and modern applications require high scalability and data-curating capacity. I have used XAMPP, an Apache distribution web-server solution, to create a local web server; after setting up this environment, the database was tested. JSON enables transferring data between the master server and the web application in human-readable form, which improved the overall efficiency.
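The migration step itself can be sketched under a few assumptions: a MySQL database served by XAMPP, a hypothetical source table named restaurants, and a local MongoDB instance as the target. Each relational row is read as a dictionary and inserted as a document:

import mysql.connector            # MySQL driver for the XAMPP side
from pymongo import MongoClient   # MongoDB driver for the target side

# Hypothetical connection settings for the local XAMPP MySQL server.
mysql_conn = mysql.connector.connect(
    host="localhost", user="root", password="", database="migrationdb"
)
cursor = mysql_conn.cursor(dictionary=True)   # rows come back as dicts
cursor.execute("SELECT * FROM restaurants")   # assumed source table

# Hypothetical local MongoDB target.
mongo = MongoClient("mongodb://localhost:27017/")
collection = mongo["migrationdb"]["restaurants"]

# Each relational row becomes one document; no schema is declared up front.
rows = cursor.fetchall()
if rows:
    collection.insert_many(rows)

cursor.close()
mysql_conn.close()

Related tables could either be migrated into their own collections and linked by references, or joined in the SELECT statement and embedded into the resulting documents, following the two modelling options described above.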

III. PROPOSED WORK

The key benefits of NoSQL are speed, scalability, price, flexibility and simplicity. Its main characteristic is its non-adherence to the relational database concept. As an example, Grolinger et al. [4] identified the difficulties of handling huge amounts of information with MapReduce. Naheman and Wei [5] studied and compared various big data tools such as HBase and other NoSQL databases, e.g. Bigtable, Cassandra, CouchDB and MongoDB. Laurence [6] worked on a virtualization system that allows querying and joining information using SQL queries against the API underlying MongoDB.

Hadoop, the main framework technology used here, is a Java-based framework that supports the processing and storage of tremendously large data sets in a distributed computing environment. Even though Hadoop is written in the Java programming language, programs for Hadoop can be written in other languages such as Python; such code is typically run through Hadoop Streaming, or translated to the JVM using Jython. Hadoop has two major areas of concern. The first is that Hadoop is built with Java, so application developers should know Java to work with the framework and to develop MapReduce jobs, although Hadoop provides a mechanism by which MapReduce applications can also be built using Python. The second concerns the co-sister technologies that work on top of this framework, one example being Cassandra: Cassandra is a NoSQL database technology that is ideal for high-speed online transactional data, whereas Hadoop is a big data analytics system.
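A minimal sketch of writing a Hadoop job in Python, here a hypothetical Hadoop Streaming word-count mapper and not code from the paper, looks as follows; Hadoop Streaming pipes input splits to the script on stdin and collects tab-separated key-value pairs from stdout:

#!/usr/bin/env python3
# mapper.py - hypothetical Hadoop Streaming mapper written in Python.
# Hadoop Streaming feeds input lines to this script on stdin and passes the
# tab-separated (key, value) pairs printed to stdout on to the shuffle phase.
import sys

for line in sys.stdin:
    for word in line.strip().split():
        print(f"{word}\t1")

Such a script would typically be submitted together with a matching reducer via the hadoop-streaming jar; the exact paths and job options depend on the cluster setup.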

Apache Spark, once a component of the Hadoop ecosystem, is now becoming the big data platform of choice for enterprises. Surveys reveal that many data analysts and data scientists prefer Spark over MapReduce, which is batch-oriented and does not lend itself to interactive applications and real-time stream processing. As we advance towards the Internet of Things, we also move towards the era of sensor-based devices, sensors that are intended to send their data back to a mothership repository. The data we deal with is mostly very complex and is deployed across various relational and non-relational systems, so the demand for analytical tools that help us extract and utilize data stored anywhere is increasing. Even for sensor input alone, the volume of data is tremendous. To improve efficiency, metadata catalogues help us relate and understand data, and machine learning is automating the task of finding data in Hadoop. Some of the emerging tracks in big data are in the fields of sensing and Internet of Things services, smart city data, and big data networking.
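For comparison, an interactive analysis of the migrated data can be sketched in a few lines of PySpark; the file name and the aggregated field are assumptions:

from pyspark.sql import SparkSession

# Assumed local Spark session and a hypothetical JSON export of the documents.
spark = SparkSession.builder.appName("migration-analysis").getOrCreate()
restaurants = spark.read.json("restaurants.json")

# Interactive aggregation: count restaurants per borough.
restaurants.groupBy("borough").count().show()

spark.stop()

Because the DataFrame stays in memory across queries, such exploratory aggregations can be refined interactively instead of resubmitting a full batch MapReduce job for each question.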

[...]
