Gillblad, Daniel and Kreuger, Per and Levin, Björn and Rudström, Åsa (2005) Preparation and analysis of multiple source industrial process data. [SICS Report]
Industrial process data is often stored in a wide variety of formats and in several different repositories. Efficient methodologies and tools for data preparation and merging are critical to the analysis of such data. Experience shows that data analysis projects involving industrial data often spend the major part of their effort on these tasks, leaving little room for model development and for generating applications. This paper identifies and classifies the needs and individual steps in the preparation of industrial data. A methodology for data preparation specifically suited to the domain is proposed, and a practically useful set of primitive operations to support the methodology is defined. Finally, a proof-of-concept data preparation system is presented that implements the proposed operations and provides a scripting facility to support the iterations in the methodology, along with a discussion of the necessary and desirable properties of such a tool.
Item Type: SICS Report
Uncontrolled Keywords: Data Preparation Methodology, Multiple Source Data Merging, Data Analysis, Data Mining, Data Cleaning, Data Preprocessing
Deposited By: Vicki Carleson
Deposited On: 29 Oct 2007
Last Modified: 18 Nov 2009 16:07