Informatica MDM Interview Questions & Answers

Master data management (MDM) is a comprehensive method of enabling an enterprise to link all of its critical data to one file, called a master file, that provides a common point of reference. When properly done, MDM streamlines data sharing among personnel and departments.

Dimensional modeling involves two types of tables, and the concept is different from the third normal form: the dimensional data model uses a fact table containing the measurements of the business and a dimension table containing the context for those measurements.
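
As a rough illustration (the table and column names below are invented, not from any particular source system), a minimal star-schema sketch in SQLite shows a fact table holding the measurements and a dimension table holding their context:

```python
# Minimal star-schema sketch; all table and column names are illustrative only.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY,   -- primary key of the dimension
    product_name TEXT,
    category     TEXT                   -- descriptive context for the measurements
);
CREATE TABLE fact_sales (
    date_key     INTEGER,
    product_key  INTEGER REFERENCES dim_product (product_key),  -- foreign key back to the dimension
    units_sold   INTEGER,               -- measurement of the business
    sales_amount REAL                   -- measurement of the business
);
""")
```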

A data movement mode determines how the PowerCenter server handles character data. The data movement mode is chosen in the Informatica server configuration settings. Two data movement modes are available in Informatica: ASCII mode and Unicode mode.
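
The practical difference between the two modes is how much space each character needs. The short Python snippet below is only an analogy for that one-byte versus two-byte difference, not Informatica behaviour itself:

```python
# Analogy only: contrast single-byte vs two-byte character handling.
text = "DATA"
print(len(text.encode("ascii")))      # 4 bytes -> one byte per character (ASCII-style handling)
print(len(text.encode("utf-16-le")))  # 8 bytes -> two bytes per character (Unicode-style handling)
```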

It’s a matter of awareness and the problem becoming urgent. We are seeing budgets increased and greater success in closing deals, particularly in the pharmaceutical and financial services industries. Forrester predicts MDM will be a $6 billion market by 2010, a 60 percent growth rate over the roughly $1 billion MDM market last year. Gartner forecasted that 70 percent of Global 2000 companies will have an MDM solution by the year 2010. These are pretty big numbers.

We can export the repository and import it into the new environment
We can use Informatica deployment groups
We can copy folders/objects
We can export each mapping to XML and import it into the new environment

Offline Operational Databases – Data warehouses in this initial stage are developed by simply copying the database of an operational system to an off-line server where the processing load of reporting does not impact on the operational system’s performance.

Offline Data Warehouse – Data warehouses in this stage of evolution are updated on a regular time cycle (usually daily, weekly or monthly) from the operational systems and the data is stored in an integrated reporting-oriented data structure.

Real Time Data Warehouse – Data warehouses at this stage are updated on a transaction or event basis, every time an operational system performs a transaction (e.g. an order or a delivery or a booking etc.)

Integrated Data Warehouse – Data warehouses at this stage are used to generate activity or transactions that are passed back into the operational systems for use in the daily activity of the organization.

A transformation is a repository object that generates, modifies, or passes data. In a mapping, transformations represent the operations the Integration Service performs on the data. Data passes through transformation ports that are linked within a mapping or mapplet.

Foreign keys of dimension tables are primary keys of entity tables.
Foreign keys of fact tables are primary keys of dimension tables.

A Mapplet is a reusable object that contains a set of transformations and enables you to reuse that transformation logic in multiple mappings.

There are two different ways to load data into dimension tables; a short sketch contrasting them follows the list.

Conventional (Slow) – All the constraints and keys are validated against the data before it is loaded; this way data integrity is maintained.
Direct (Fast) – All the constraints and keys are disabled before the data is loaded. Once the data is loaded, it is validated against all the constraints and keys. If data is found invalid or dirty, it is not included in the index and all future processes are skipped on this data.
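
A rough Python sketch of the contrast (the is_valid constraint and function names are invented purely for illustration; real loaders work against actual database constraints and indexes):

```python
# Illustrative sketch only: validate-before-load versus load-then-validate.
def is_valid(row):
    # toy "constraint": quantity must be a positive integer
    return isinstance(row.get("quantity"), int) and row["quantity"] > 0

def conventional_load(rows):
    table = []
    for row in rows:                  # every row is checked BEFORE it is written
        if not is_valid(row):
            raise ValueError(f"constraint violated, row rejected up front: {row}")
        table.append(row)
    return table

def direct_load(rows):
    table = list(rows)                # everything is written first, checks are deferred
    clean = [r for r in table if is_valid(r)]
    dirty = [r for r in table if not is_valid(r)]   # dirty rows are excluded from further processing
    return clean, dirty
```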

Designed by Informatica Corporation, PowerCenter is data integration software providing an environment that allows loading data into a centralized location such as a data warehouse. Data can be extracted from an array of sources, transformed as per the business logic, and then loaded into files as well as relational targets.

MDM stands for Master Data Management. It is a comprehensive method used to enable an enterprise to link all of its critical data to a single file, also known as a master file, providing a common point of reference. When done properly, MDM helps streamline the process of data sharing among departments and personnel.

This specifies the degree of parallelism set on the base object table as well as its related tables. Although it does not apply to all batch processes, it can have a positive effect on performance where it is used. Nevertheless, its use is limited by the number of CPUs on the database server machine along with the amount of available memory. The default value is 1.

There is huge awareness of MDM. Gartner recently hosted an MDM conference for the first time [piggy-backing on its CRM conference], and they pulled in about 500 attendees.

As to whether they “get it,” it depends on who you’re talking to. Most of the IT people get it. Business users understand the moniker, but they might or might not understand MDM quite as well. I find that business users often require education in terms of what it can do for them and what value it brings. With IT people, it’s a different conversation; they want to know more about the features and how we differentiate ourselves from the competition.

Technical folks in data governance often have a challenge selling the project and getting the funding. Management is looking for return on investment; they need MDM tied to quantifiable benefits that business leaders understand, like dollar amounts around ROI.

There are various fundamental stages of Data warehousing. They are:

1. Offline Operational Databases: This is the first stage, in which data warehouses are developed simply by copying the operational system's database to an offline server, where the processing load of reporting does not impact the performance of the operational system.
2. Offline Data Warehouse: In this stage of development, data warehouses are updated on a regular basis from the operational systems, and the data is stored in an integrated, reporting-oriented data structure.
3. Real Time Data Warehouse: During this stage, data warehouses are updated on an event or transaction basis, every time an operational system performs a transaction.
4. Integrated Data Warehouse: This is the last stage, where data warehouses are used to generate transactions or activity that is passed back into the operational systems for use in the organization's daily activity.

The dimensional data model concept involves two types of tables, and it is different from the third normal form. This concept uses a fact table, which contains the measurements of the business, and a dimension table, which contains the context (dimensions of calculation) of the measurements.

A transformation is a repository object that generates, modifies or passes data. Transformations in a mapping represent the operations the Integration Service performs on the data. Data passes through transformation ports that are linked in a mapping or mapplet.

There are various components of Informatica PowerCenter. They are as follows:

1. PowerCenter Repository
2. PowerCenter Domain
3. PowerCenter Client
4. Administration Console
5. Integration Service
6. Repository Service
7. Data Analyzer
8. Web Services Hub
9. PowerCenter Repository Reports
10. Metadata Manager

Data Mining is the process of analyzing data from different perspectives and summarizing it into useful information.

A Data Warehouse is the main repository of an organization's historical data, its corporate memory. It contains the raw material for management's decision support system. The critical factor leading to the use of a data warehouse is that a data analyst can perform complex queries and analysis, such as data mining, on the information without slowing down the operational systems. Data warehousing is a collection of data designed to support management decision-making. Data warehouses contain a wide variety of data that present a coherent picture of business conditions at a single point in time. A data warehouse is a repository of integrated information, available for queries and analysis.

PowerCenter is data integration software from Informatica Corporation which provides an environment that allows loading data into a centralized location such as a data warehouse. Data can be extracted from multiple sources, transformed according to the business logic, and loaded into files and relational targets.

Following are the various components of Informatica PowerCenter:

  • PowerCenter Domain
  • PowerCenter Repository
  • Administration Console
  • PowerCenter Client
  • Repository Service
  • Integration Service
  • Web Services Hub
  • Data Analyzer
  • Metadata Manager
  • PowerCenter Repository Reports

A mapping is a set of source and target definitions linked by transformation objects that define the rules for data transformation. Mappings represent the data flow between sources and targets.

Standalone Repository : A repository which functions individually and is unrelated to any other repositories.
Global Repository : This is a centralized repository in a domain. This repository can contain shared objects across the repositories in a domain. The objects are shared through global shortcuts.
Local Repository : A local repository is within a domain. A local repository can connect to a global repository using global shortcuts and can use objects in its shared folders.

There are various repositories that can be created with the help of Informatica Repository Manager. They are as follows:
1. Standalone Repository: A repository that functions individually and is not related to any other repositories.
2. Local Repository: This repository functions within a domain. It can connect to a global repository with the help of global shortcuts and can make use of objects in its shared folders.
3. Global Repository: This repository works as a centralised repository in a domain. It contains objects shared across the repositories in the domain.

A dimension table contains the textual attributes of the measurements stored in the fact tables. A dimension table is a collection of hierarchies, categories, and logic that a user can use to traverse hierarchy nodes.

The Hub Console refreshes the lock on the current connection every 60 seconds. A user can release a lock manually. If the user switches to another database while holding a lock, the lock is released automatically. If the user terminates the Hub Console, the lock expires after one minute.

It is the main repository of an organisation's historical data and its corporate memory, containing the raw material for management's decision support system. What leads to the use of data warehousing is that it allows a data analyst to execute complex queries and analysis, such as data mining, on the information without slowing down the operational systems. The collection of data in data warehousing is planned to support management decision-making. These warehouses contain an array of data presenting a coherent picture of business conditions at a single point in time. A data warehouse is a repository of information that is available for analysis and query.

A Mapping Parameter is a static value that you define before running the session, and its value remains the same until the end of the session. When we run the session, PowerCenter evaluates the value from the parameter and retains the same value throughout the session. When the session runs again, it reads its value from the file.

A Mapping Variable is dynamic and can change anytime during the session. PowerCenter reads the initial value of the variable before the start of the session and changes its value by using variable functions; before ending the session, it saves the current value (the last value held by the variable). The next time the session runs, the variable's value is the last value saved in the previous session.
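
This behaviour can be mimicked outside Informatica with a small Python analogy; the file name and function below are hypothetical stand-ins, not part of PowerCenter:

```python
# Python analogy only: a "parameter" is read once and fixed for the run, while a
# "variable" starts from the value saved by the previous run and its final value is saved again.
import json, os

STATE_FILE = "variable_state.json"        # stands in for the repository where the value is persisted

def run_session(parameter_value):
    tax_rate = parameter_value            # mapping parameter: constant for the whole run

    run_counter = 0                       # mapping variable: pick up the last saved value...
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            run_counter = json.load(f)["run_counter"]

    run_counter += 1                      # ...change it during the run (like a variable function)...

    with open(STATE_FILE, "w") as f:      # ...and save the current value for the next run
        json.dump({"run_counter": run_counter}, f)

    print(f"tax_rate={tax_rate}, run_counter={run_counter}")

run_session(0.05)   # first run  -> run_counter = 1
run_session(0.05)   # second run -> run_counter = 2 (read back from the saved state)
```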

The fact table contains the measurements of business processes; it also contains the foreign keys to the dimension tables. For example, if your business process is “paper production”, then “average production of paper by one machine” or “weekly production of paper” would be considered measurements of the business process.
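
Staying with the paper-production example, a toy query against such a fact table (SQLite, with invented table and column names) shows how a measurement is summarised by a dimension attribute:

```python
# Toy example of "average production of paper by one machine" against a star schema.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_machine (machine_key INTEGER PRIMARY KEY, machine_name TEXT);
CREATE TABLE fact_production (
    week_key      INTEGER,
    machine_key   INTEGER REFERENCES dim_machine (machine_key),  -- foreign key to the dimension
    tons_produced REAL                                           -- the measurement
);
INSERT INTO dim_machine VALUES (1, 'Machine A'), (2, 'Machine B');
INSERT INTO fact_production VALUES (202401, 1, 12.5), (202402, 1, 11.0), (202401, 2, 9.0);
""")
query = """
    SELECT m.machine_name, AVG(f.tons_produced) AS avg_weekly_tons
    FROM fact_production f
    JOIN dim_machine m ON m.machine_key = f.machine_key
    GROUP BY m.machine_name
"""
for name, avg_tons in conn.execute(query):
    print(name, avg_tons)
```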

A Mapping variable is dynamic, i.e. it can change anytime throughout the session. PowerCenter reads the variable's initial value before the start of the session and uses variable functions to change the value; before the session ends, it saves the current value (the last value held by the variable). The next time the session runs, the value of the variable is the last value saved in the previous session.
A Mapping parameter is a static value, defined by you before the session starts, and the value remains the same until the end of the session. Once the session runs, PowerCenter evaluates the parameter's value and retains the same value during the entire session. The next time the session runs, it reads the value from the file.

COBOL source definitions
Joiner transformations
Normalizer transformations
Non-reusable Sequence Generator transformations
Pre- or post-session stored procedures
Target definitions
PowerMart 3.5-style LOOKUP functions
XML source definitions
IBM MQ source definitions

OLAP is an abbreviation of Online Analytical Processing. An OLAP system is an application that collects, manages, processes, and presents multidimensional data for analysis and management purposes.
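
A tiny pandas sketch of the kind of multidimensional summary such a system presents (the data and column names are invented, and pandas is only a stand-in here, not an OLAP server):

```python
# Slice a measure across two dimensions (region x quarter), OLAP-style.
import pandas as pd

sales = pd.DataFrame({
    "region":  ["East", "East", "West", "West"],
    "quarter": ["Q1", "Q2", "Q1", "Q2"],
    "revenue": [100, 120, 90, 110],
})
cube = sales.pivot_table(index="region", columns="quarter", values="revenue", aggfunc="sum")
print(cube)
```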

There is always a challenge for technical folks in data governance to sell the project and get the funding. Management is always looking for ROI; they require MDM tied to quantifiable benefits that business leaders understand, such as dollar amounts around ROI.

It is a process that helps in analyzing data from several perspectives and summarizing it into useful information.

A data movement mode helps determine how the PowerCenter server handles character data. The data movement mode is selected in the Informatica server configuration settings. There are two data movement modes available in Informatica:

Unicode mode and ASCII mode

Explain OLAP.
OLAP stands for Online Analytical Processing. It is an application that gathers, manages, processes, and presents multidimensional data for management and analysis purposes.

Following are ways to remove duplicate records; a small illustration of the underlying SQL follows the list.

In the Source Qualifier, use select distinct
Use an Aggregator and group by all fields
Override the SQL query in the Source Qualifier
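
What those options boil down to at the SQL level can be shown against a toy table (SQLite, with invented table and column names):

```python
# Toy demonstration of de-duplication with SELECT DISTINCT and GROUP BY.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (cust_id INTEGER, cust_name TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, "Asha"), (1, "Asha"), (2, "Ravi")])

# What "select distinct" in the Source Qualifier (or an overridden SQL query) boils down to:
print(conn.execute("SELECT DISTINCT cust_id, cust_name FROM customers").fetchall())

# Grouping by every column, as the Aggregator approach does, yields the same de-duplicated rows:
print(conn.execute("SELECT cust_id, cust_name FROM customers GROUP BY cust_id, cust_name").fetchall())
```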

There are various tables that are linked with staging data in Informatica MDM. They are:

1. Landing Table
2. Raw Table
3. Rejects Table
4. Staging Table

Two types of locks are used in Informatica MDM 10.1. They are:
1. Exclusive Lock: Allows just one user to make changes to the underlying operational reference store.
2. Write Lock: Allows multiple users to make changes to the underlying metadata at the same time.

1. The repository can be exported and imported into the new environment
2. Informatica deployment groups can be used
3. Folders/objects can be copied
4. Each mapping can be exported to XML and then imported into the new environment
