
Research And Implementation On Managing Big-data In Astronomical

Posted on: 2014-09-11
Degree: Master
Type: Thesis
Country: China
Candidate: J H Yao
Full Text: PDF
GTID: 2250330422951980
Subject: Software engineering
Abstract/Summary:
In recent years, with the rise of cloud computing and cloud storage, massive amounts of data are generated in people's daily lives and in every professional field. In astronomy, the LSST (Large Synoptic Survey Telescope) project was proposed in recent years to observe the whole sky for ten years. During these 10 years, about 60 PB of data will be collected and stored in a database so that the public and astronomers can explore the sky and make new discoveries about the universe. Clearly, 60 PB of data is a big challenge for current database management systems.

Traditional database management systems such as Microsoft SQL Server and MySQL are not suitable for processing big data. Consequently, new technologies have been proposed to improve performance on big-data workloads. An alternative solution is to build clusters of inexpensive computers running systems such as Hive, HadoopDB, IBM DB2 and Oracle RAC. This thesis analyzes the requirements of the database management system and studies the new features and techniques provided by Oracle. It then tests a variety of queries to evaluate the performance of Oracle RAC. Moreover, it proposes new methods that combine traditional techniques, such as indexing and partitioning, with parallel execution to improve query performance on the existing database. Oracle RAC is compared with Hive and HadoopDB to demonstrate its characteristics. In addition, the scalability and extensibility of Oracle Real Application Clusters are tested to see whether it can satisfy the needs of the LSST project, since supercomputers are too expensive and a cluster of inexpensive computers can also provide high performance, high availability and high flexibility.

To make query performance easier to test, a performance testing tool was built to simplify the testing process. In practice, this thesis provides methods for optimizing query performance on the existing database schema, as a part of the Large Synoptic Survey Telescope project in France.
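The abstract does not describe how the performance testing tool works. As a rough illustration of the general idea only, the following is a minimal sketch of a query timing harness. It is not the thesis's tool: it uses Python's built-in sqlite3 module as a stand-in for Oracle RAC, and the table `object`, its columns, and the functions `time_query`, `summarize` and `build_demo_db` are all hypothetical names chosen for this example.

```python
import sqlite3
import statistics
import time


def time_query(conn, sql, params=(), repeats=5):
    """Run one query several times and return the wall-clock seconds of each run."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # Fetch every row so the measurement covers the full query, not just its start.
        conn.execute(sql, params).fetchall()
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Reduce a list of timings to a few summary statistics."""
    return {
        "runs": len(timings),
        "min_s": min(timings),
        "median_s": statistics.median(timings),
        "max_s": max(timings),
    }


def build_demo_db(rows=10000):
    """Create a small in-memory table (hypothetical schema) to time queries against."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE object (object_id INTEGER PRIMARY KEY, ra REAL, decl REAL)"
    )
    conn.executemany(
        "INSERT INTO object VALUES (?, ?, ?)",
        ((i, (i * 0.036) % 360.0, (i % 180) - 90.0) for i in range(rows)),
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = build_demo_db()
    sql = "SELECT COUNT(*) FROM object WHERE ra BETWEEN ? AND ?"
    print(summarize(time_query(conn, sql, (10.0, 20.0))))
```

Repeating each query and reporting the median rather than a single run reduces the influence of caching and other one-off effects. The same loop could be pointed at a different database by swapping the connection object, provided its driver offers a compatible `execute` interface.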
Keywords/Search Tags: Oracle RAC, index, partitioning, parallel execution, DB optimization