Friday, 14 June 2013

Datastage!!




What is DataStage?
o Design jobs for Extraction, Transformation, and Loading (ETL)
o Ideal tool for data integration projects, such as data warehouses, data marts, and system migrations
o Import, export, create, and manage metadata for use within jobs
o Schedule, run, and monitor jobs, all within DataStage
o Administer your DataStage development and execution environments

DataStage is a tool for designing Extraction, Transformation, and Loading (ETL) jobs.

- An ideal tool for data integration projects such as data warehouses, data marts, and system migrations

- Metadata can be imported, exported, created, and managed for use within jobs

- DataStage allows scheduling, running, and monitoring jobs

- Allows you to administer the DataStage development and execution environments.
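
To make the three ETL phases concrete, here is a minimal sketch in plain Python. It is not DataStage code: the sample data, the function names (extract, transform, load) and the in-memory "warehouse" list are all assumptions made up for illustration.

```python
import csv
import io

# Hypothetical source data; the second record has a bad amount on purpose.
RAW = "id,amount\n1, 10.5 \n2,abc\n3,7\n"

def extract(text):
    """Extraction: read raw records from a source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transformation: convert types and skip records that cannot be parsed."""
    clean = []
    for row in rows:
        try:
            clean.append({"id": int(row["id"]), "amount": float(row["amount"])})
        except ValueError:
            continue  # skip the malformed record
    return clean

def load(rows, target):
    """Loading: append the cleaned records to the target store."""
    target.extend(rows)
    return len(rows)

warehouse = []  # stands in for a data warehouse table
loaded = load(transform(extract(RAW)), warehouse)
print(loaded, warehouse)
```

In a real DataStage job each of these steps would be a stage on the design canvas rather than a function.
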
DataStage is a central filestore with three added benefits:
- Security controls that allow researchers to have a "private" area accessible only to themselves and the group leader, plus "shared" and "collaborative" areas for files of use to the whole research group.
- A web interface that lets users annotate their files and access data from outside their "home" computer.
- The option to send data to a repository for permanent storage.
A likely scenario is that each research group (or project) would have its own instance of DataStage for internal use. The best of this data would be sent to a repository for permanent archival or publication.
Name the command-line utilities used to import and export DS jobs.
DS jobs are imported with dsimport.exe and exported with dsexport.exe.
What is the difference between DataStage 7.5 and 7.0?
DataStage 7.5 added many new stages for greater robustness and smoother performance, such as the Procedure stage, the Command stage, and Generate Report.
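In DataStage, how can you fix the truncated data error?
The truncated data error can be fixed by setting the environment variable APT_IMPORT_REJECT_STRING_FIELD_OVERRUNS, which makes DataStage reject records whose string fields exceed their defined length instead of truncating them.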
Define Merge.
Merge means joining two or more tables into one. The tables are joined on the primary key columns that they have in common. In the DataStage Merge stage, one master data set is combined with one or more update data sets.
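
The idea of joining on a key column can be sketched in a few lines of plain Python. This is only an illustration of the concept, not DataStage code: the sample rows, the cust_id key, and the merge_on_key helper are hypothetical, and in this sketch master rows without a matching update are simply dropped.

```python
# Hypothetical sample data; table and column names are illustrative only.
master = [
    {"cust_id": 1, "name": "Asha"},
    {"cust_id": 2, "name": "Ben"},
    {"cust_id": 3, "name": "Chen"},
]
updates = [
    {"cust_id": 1, "city": "Pune"},
    {"cust_id": 3, "city": "Leeds"},
]

def merge_on_key(master_rows, update_rows, key):
    """Join each master row with the update row that shares its key value."""
    # Index the update rows by key so each lookup is a dictionary access.
    updates_by_key = {row[key]: row for row in update_rows}
    merged = []
    for row in master_rows:
        match = updates_by_key.get(row[key])
        if match is not None:
            # Combine the columns of both rows into one output row.
            merged.append({**row, **match})
    return merged

result = merge_on_key(master, updates, "cust_id")
for row in result:
    print(row)
```

Customer 2 has no update row, so it does not appear in the output of this sketch.
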
Differentiate between a data file and a descriptor file.
As the names imply, a data file contains the data, and a descriptor file contains the description of (information about) the data held in the data files.