
Friday, 9 March 2012

Informatica Exceptions – 3

Here are a few more exceptions:
1. There are occasions where sessions fail with the following error in the Workflow Monitor:
"First error code [36401], message [ERROR: Session task instance [session XXXX]: Execution terminated unexpectedly.] "
where XXXX is the session name.
The server log/workflow log shows the following:
"LM_36401 Execution terminated unexpectedly."
To determine the cause of the error, do the following:
a. If the session fails before initialization and no session log is created, look for errors in the workflow log and the pmrepagent log files.
b. If the session log is created and if the log shows errors like
"Caught a fatal signal/exception" or
"Unexpected condition detected at file [xxx] line yy"
then a core dump has been created on the server machine. In this case Informatica Technical Support should be contacted with specific details. This error may also occur when the PowerCenter server log becomes too large and the server is no longer able to write to it. In this case a workflow and session log may not be completed. Deleting or renaming the PowerCenter Server log (pmserver.log) file will resolve the issue.
2. Given below is not an exception but a scenario which most of us would have come across.
Rounding problem occurs with columns in the source defined as Numeric with Precision and Scale or Lookups fail to match on the same columns. Floating point arithmetic is always prone to rounding errors (e.g. the number 1562.99 may be represented internally as 1562.988888889, very close but not exactly the same). This can also affect functions that work with scale such as the Round() function. To resolve this do the following:
a. Select the Enable high precision option for the session.
b. Define all numeric ports as Decimal datatype with the exact precision and scale desired. When high precision processing is enabled, the PowerCenter Server supports numeric values up to 28 digits. However, the tradeoff is a performance hit (actual performance really depends on how many decimal ports there are).
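A quick way to see the underlying issue outside of Informatica is to contrast binary floating-point arithmetic with fixed-point decimal arithmetic. The Oracle query below is purely illustrative (BINARY_DOUBLE roughly stands in for the float handling used when high precision is off, NUMBER(10,2) for a Decimal port with an exact scale):

-- Illustration only: repeated binary floating-point additions typically drift away from the exact value,
-- while fixed-point NUMBER arithmetic stays exact
SELECT SUM(CAST(0.1 AS BINARY_DOUBLE)) AS float_sum,  -- usually not exactly 1
       SUM(CAST(0.1 AS NUMBER(10,2)))  AS exact_sum   -- exactly 1
  FROM dual
CONNECT BY LEVEL <= 10;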

Exceptions in Informatica – 2

Let us see a few more strange exceptions in Informatica.
1. Sometimes the Session fails with the below error message.
"FATAL ERROR : Caught a fatal signal/exception
FATAL ERROR : Aborting the DTM process due to fatal signal/exception."
There might be several reasons for this. One possible reason could be the way the function SUBSTR is used in the mappings, like the length argument of the SUBSTR function being specified incorrectly.
Example:
IIF(SUBSTR(MOBILE_NUMBER, 1, 1) = '9',
SUBSTR(MOBILE_NUMBER, 2, 24),
MOBILE_NUMBER)
In this example MOBILE_NUMBER is a variable port that is 24 characters long.
Since the field itself is only 24 characters long, a SUBSTR that starts at position 2 and reads 24 characters runs past the end of the field (up to the 25th character).
To solve this, correct the length argument so that it does not go beyond the length of the field, or omit the length argument altogether so that the function returns the entire string from the start position onwards.
Example:
In this example modify the expression as follows:
IIF(SUBSTR(MOBILE_NUMBER, 1, 1) = '9',
SUBSTR(MOBILE_NUMBER, 2, 23),
MOBILE_NUMBER)
OR
IIF(SUBSTR(MOBILE_NUMBER, 1, 1) = '9',
SUBSTR(MOBILE_NUMBER, 2),
MOBILE_NUMBER)
2. The following error can occur at times when a session is run
"TE_11015 Error in xxx: No matching input port found for output port OUTPUT_PORT TM_6006 Error initializing DTM for session…"
Where xxx is a Transformation Name.
This error occurs when there is corruption in the transformation.
To resolve this, recreate the transformation that is reporting the error in the mapping.
3. At times you may run into the below problems:
1. When opening designer, you get "Exception access violation", "Unexpected condition detected".
2. Unable to see the navigator window, output window or overview window in Designer even after toggling them on.
3. Toolbars or checkboxes are not showing up correctly.
These are all indications that the pmdesign.ini file might be corrupted. To solve this, follow the steps below.
1. Close Informatica Designer
2. Rename the pmdesign.ini (in c:\winnt\system32 or c:\windows\system).
3. Re-open the designer.

Exceptions in Informatica

No product/tool is free of strange exceptions/errors; let us look at some of them.
1. You get the below error when you do "Generate SQL" in Source Qualifier and try to validate it.
"Query should return exactly n field(s) to match field(s) projected from the Source Qualifier"
Where n is the number of fields projected from the Source Qualifier.
Possible reasons for this to occur are:
1. The order of ports may be wrong
2. The number of ports in the transformation may be more or fewer than the query returns.
3. Sometimes you will have the correct number of ports in the correct order and still face this error. In that case, make sure that the owner/schema name is specified correctly for the tables used in the Source Qualifier query, for example by owner-qualifying them as shown below.
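For illustration, a Source Qualifier query override with an owner-qualified table might look like this (the schema, table and column names below are made up):

-- The select list must return exactly the ports projected from the Source Qualifier, in the same order
SELECT EMPLOYEE.EMP_ID, EMPLOYEE.EMP_NAME, EMPLOYEE.DOJ
  FROM HR_OWNER.EMPLOYEE EMPLOYEE   -- HR_OWNER is the owner/schema name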
2. The following error occurs at times when an Oracle table is used
"[/export/home/build80/zeusbuild/vobs/powrmart/common/odl/oracle8/oradriver.cpp] line [xxx]"
Where xxx is some line number mostly 241, 291 or 416.
Possible reasons are:
1. The table was imported using the Oracle-supplied ODBC driver; use the DataDirect Oracle ODBC driver instead.
2. If the table has been imported using Oracle drivers that are not supported, columns with the Varchar2 datatype are replaced by the String datatype and Number columns are imported with a precision of zero (0).
3. You might get the below error while trying to save a mapping:
"Unexpected Condition Detected"
Warning: Unexpected condition at: statbar.cpp: 268
Contact Informatica Technical Support for assistance
This happens when there is not enough memory available on the system.
To resolve this we can either:
1. Increase the virtual memory on the system, or
2. If you continue to receive the same error even after increasing the virtual memory, then in Designer go to Tools -> Options, select the General tab and clear the "Save MX Data" option.

Friday, 2 December 2011

Informatica and Stored Procedures

A. Described below is a scenario where the requirement is to use a stored procedure that returns a cursor as a source. PowerCenter does not support a stored procedure that returns a cursor as a source. The workaround is given below.
1. Create the procedure that will load the data into a new table:
CREATE OR REPLACE procedure load (p_initial_date in date, p_final_date in date) as
  str_load  varchar2(500);
  str_clean varchar2(500);
begin
  str_clean := 'DELETE FROM EMP';
  str_load  := 'INSERT INTO EMP SELECT * FROM EMPLOYEE WHERE DOJ BETWEEN TRUNC(:1) AND TRUNC(:2)';
  execute immediate str_clean;
  -- bind the date parameters into the dynamic INSERT
  execute immediate str_load using p_initial_date, p_final_date;
exception
  when others then
    rollback;
    raise;  -- re-raise so the failure is visible to the calling session
end load;

2. Create the table that will receive the data from the procedure:
SQL> create table EMP as SELECT * from EMPLOYEE where 1 > 2;
3. Add a Stored Procedure transformation to the PowerCenter mapping. This transformation will execute the new procedure, called LOAD in this example.
4. Set the stored procedure type to Source Pre-load so that it is executed before the source table is read.
5. Import the EMP table as a Source Definition. This table will be populated by the new stored procedure.
Note: If the original stored procedure is used by the customer application and you cannot change its source code, you can create a new stored procedure that calls the original one (instead of modifying it to insert into a table) and performs the insert into the new table by looping over the returned cursor; a sketch of such a wrapper is given below.
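A minimal sketch of such a wrapper, assuming a hypothetical original procedure GET_EMPLOYEES that returns a SYS_REFCURSOR of EMPLOYEE rows (all names here are illustrative):

create or replace procedure load_from_cursor as
  c   sys_refcursor;
  rec EMPLOYEE%rowtype;
begin
  GET_EMPLOYEES(c);              -- call the original procedure, left untouched
  delete from EMP;               -- clear the staging table used as the PowerCenter source
  loop
    fetch c into rec;
    exit when c%notfound;
    insert into EMP values rec;  -- stage each row returned by the cursor
  end loop;
  close c;
end load_from_cursor;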
B. Given below is a situation where you want to pass a mapping variable to a Stored Procedure transformation (it can be either connected or unconnected).
Connected Stored Procedure
The parameters that are passed to a connected Stored Procedure have to be linked from another transformation.
Given below are the steps to pass a mapping variable to a connected Stored Procedure transformation:
  1. Create an Expression transformation.
  2. Create an output port in the Expression transformation with the following expression:
$$mapping_variable
This sets the value of this output port to the mapping variable.
  3. Link this output port to the Stored Procedure transformation.
Unconnected Stored Procedure
For unconnected Stored Procedure transformations you can use the mapping variable in the expression calling the stored procedure.
Follow the steps below to pass a mapping variable to an unconnected Stored Procedure transformation:
  1. Create an Expression transformation.
  2. Create an output port in the Expression transformation with the following expression:
:SP.GET_NAME_FROM_ID($$mapping_variable, PROC_RESULT)

If you attempt to use a mapping variable to store the output value of the stored procedure, the session will fail with the below error: “TE_7002 Transformation Parse Fatal Error; transformation stopped: invalid function reference. Failed to Initialize Server Transformation.”

To resolve the issue replace the mapping variable with the PROC_RESULT system variable.
Example:
Incorrect, using a mapping variable:
:SP.PROCEDURE(FIELD1, $$mapping_variable)
Correct, using the PROC_RESULT system variable:
:SP.PROCEDURE(FIELD1,PROC_RESULT)
Or
:SP.PROCEDURE($$mapping_variable,PROC_RESULT)
The PROC_RESULT system variable assigns the stored procedure output to the port with this expression.
You are welcome to share your thoughts/suggestions to enrich any “Infa Tip”.

Monday, 14 November 2011

Process Control / Audit of Workflows in Informatica

1. Process Control – Definition
Process control or auditing of a workflow in Informatica means capturing job information such as start time, end time, read count, insert count, update count and delete count. This information is captured and written into a table as the workflow executes.
2. Structure of Process Control/Audit table
The table structure of the process control table is given below.
Process Control structure

Column Name         Datatype       Length   Description
PROCESS_RUN_ID      Number(p,s)    11       A unique number used to identify a specific process run.
PROCESS_NME         Varchar2       120      The name of the process (this column will be populated with the names of the Informatica mappings).
START_TMST          Date           19       The date/time when the process started.
END_TMST            Date           19       The date/time when the process ended.
ROW_READ_CNT        Number(p,s)    16       The number of rows read by the process.
ROW_INSERT_CNT      Number(p,s)    16       The number of rows inserted by the process.
ROW_UPDATE_CNT      Number(p,s)    16       The number of rows updated by the process.
ROW_DELETE_CNT      Number(p,s)    16       The number of rows deleted by the process.
ROW_REJECT_CNT      Number(p,s)    16       The number of rows rejected by the process.
USER_ID             Varchar2       32       The ETL user identifier associated with the process.
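For reference, a minimal Oracle DDL sketch for this audit table (the primary key constraint and exact datatypes are assumptions; adjust them to your own standards):

CREATE TABLE PROCESS_CONTROL
(
  PROCESS_RUN_ID   NUMBER(11) NOT NULL,  -- unique id for a specific process run
  PROCESS_NME      VARCHAR2(120),        -- Informatica mapping / job name
  START_TMST       DATE,
  END_TMST         DATE,
  ROW_READ_CNT     NUMBER(16),
  ROW_INSERT_CNT   NUMBER(16),
  ROW_UPDATE_CNT   NUMBER(16),
  ROW_DELETE_CNT   NUMBER(16),
  ROW_REJECT_CNT   NUMBER(16),
  USER_ID          VARCHAR2(32),         -- ETL user associated with the run
  CONSTRAINT PK_PROCESS_CONTROL PRIMARY KEY (PROCESS_RUN_ID)
);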

3.  Mapping Logic and Build Steps
The process control flow has two data flows: an insert flow and an update flow. The insert flow runs before the main mapping and the update flow runs after the main mapping; this ordering is set in the “Target Load Plan”. The source for both flows can be a dummy source that returns one record as output, for example select ‘process’ from dual or select count(1) from the above table. The following mapping variables are to be created:


Mapping Parameter and variables

$$PROCESS_RUN_ID
$$PROCESS_NAME
$$READ_COUNT
$$INSERT_COUNT
$$UPDATE_COUNT
$$DELETE_COUNT
$$REJECT_COUNT




Steps to create Insert flow:
  • 1. Have “select ‘process’ from dual” as the SQL query in the source qualifier
  • 2. Have a sequence generator to create running process_run_id values
  • 3. In an expression, use SetVariable($$PROCESS_RUN_ID, NEXTVAL) and assign $$PROCESS_NAME to o_process_name, an output-only field
  • 4. In an expression, assign $$SessionStartTime to o_start_time, an output-only field
  • 5. In an expression, accept the sequence id from the sequence generator
  • 6. Insert into the target ‘process control table’ with the above values (the effective SQL is sketched below)
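The net effect of the insert flow on the audit table is roughly the statement below (written as plain SQL for clarity; in the mapping, the values come from the sequence generator, the mapping variables and $$SessionStartTime, and the :o_* placeholders stand for the expression output ports):

INSERT INTO PROCESS_CONTROL (PROCESS_RUN_ID, PROCESS_NME, START_TMST, USER_ID)
VALUES (:o_process_run_id, :o_process_name, :o_starttime, :o_user_id);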
Process Control Image after Insert flow

PROCESS_RUN_ID    1
PROCESS_NME       VENDOR_DIM_LOAD
START_TMST        8/23/2009 12:23
END_TMST          (not yet populated)
ROW_READ_CNT      (not yet populated)
ROW_INSERT_CNT    (not yet populated)
ROW_UPDATE_CNT    (not yet populated)
ROW_DELETE_CNT    (not yet populated)
ROW_REJECT_CNT    (not yet populated)
USER_ID           INFA8USER

Steps in the main mapping:
  • 1. After the source qualifier, increment a read-count variable (v_read_count) in an expression for each record read, and use SetMaxVariable($$READ_COUNT, v_read_count)
  • 2. Before the update strategy of the target instances, do the same for the insert/update/delete counts; all the variables are now set with their respective counts


Steps to create Update flow:
  • 1. Have “select ‘process’ from dual” as the SQL query in the source qualifier
  • 2. Use SetMaxVariable to fetch the process_run_id created in the insert flow
  • 3. In an expression, assign $$INSERT_COUNT to o_insert_count, an output-only field, and assign all the other counts in the same way
  • 4. In an expression, assign $$SessionEndTime to o_end_time, an output-only field
  • 5. Update the target ‘Process Control Table’ with all the above values where process_run_id equals the process_run_id generated in the insert flow (the effective SQL is sketched below)
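Similarly, the net effect of the update flow is roughly the following (again, the :o_* placeholders stand for the expression output ports):

UPDATE PROCESS_CONTROL
   SET END_TMST       = :o_endtime,       -- $$SessionEndTime
       ROW_READ_CNT   = :o_read_count,    -- $$READ_COUNT
       ROW_INSERT_CNT = :o_insert_count,  -- $$INSERT_COUNT
       ROW_UPDATE_CNT = :o_update_count,  -- $$UPDATE_COUNT
       ROW_DELETE_CNT = :o_delete_count,  -- $$DELETE_COUNT
       ROW_REJECT_CNT = :o_reject_count   -- $$REJECT_COUNT
 WHERE PROCESS_RUN_ID = :o_process_run_id;  -- run id captured during the insert flow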
Process Control Image after Update flow

PROCESS_RUN_ID    1
PROCESS_NME       VENDOR_DIM_LOAD
START_TMST        8/23/2009 12:23
END_TMST          8/23/2009 12:30
ROW_READ_CNT      1000
ROW_INSERT_CNT    900
ROW_UPDATE_CNT    60
ROW_DELETE_CNT    40
ROW_REJECT_CNT    0
USER_ID           INFA8USER

4. Merits over Informatica Metadata
This information is also available in the Informatica metadata; however, maintaining it within our own system has the following benefits:
  • No need to write complex queries to bring in the data from the metadata tables
  • Job names need not be mapping names and can be user-friendly names
  • Insert/update/delete counts of all targets, as well as of each individual target, can be audited
  • The audit information can be maintained outside the metadata security level and can be used by other mappings in their transformations
  • Can be used by mappings that build parameter files
  • Can be used by mappings that govern data volume
  • Can be used by production support to quickly find the status of a load
You are welcome to share your thoughts/suggestions to enrich any “Infa Tip”.

crontab -e

bash
# take a backup of the existing crontab before editing
$ crontab -l > filename.txt

# set vi as the editor used by crontab
$ export EDITOR=vi

# edit the crontab
$ crontab -e

Data Integration Challenge – Capturing Changes

When we receive data from source systems, the data file will not carry a flag indicating whether the record provided is new or changed. We need to build a process to determine the changes and then push them to the target table.
There are two steps to it:
  1. Pull the incremental data from the source file or table
  2. Process the pulled incremental data and determine the impact of it on the target table as Insert or Update or Delete

Step 1: Pull the incremental data from the source file or table

If the source system has audit columns such as a date, then we can identify the new records; otherwise we cannot identify the new records and have to process the complete data.
For a source system's file or table that has audit columns, we would follow the steps below:
  1. While reading the source records for a day (session), find the maximum value of the date (audit field) and store it in a persistent variable or a temporary table
  2. Use this persistent variable value as a filter in the next run to pull the incremental data from the source table (an illustrative filter is sketched below)
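As an illustration, if the persistent variable is called $$LAST_EXTRACT_DATE and the source audit column is LAST_UPDATE_DT (both names are made up), the source qualifier filter for the next run could look like:

SELECT *
  FROM SOURCE_TABLE
 WHERE LAST_UPDATE_DT > TO_DATE('$$LAST_EXTRACT_DATE', 'MM/DD/YYYY HH24:MI:SS')
-- $$LAST_EXTRACT_DATE is refreshed at the end of each run (e.g. via SetMaxVariable on the audit column)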

Step 2: Determine the impact of the record on target table as Insert/Update/ Delete

Following are the scenarios we would face and the suggested approaches:
  1. Data file has only incremental data from Step 1 or the source itself provide only incremental data
    • do a lookup on the target table and determine whether it’s a new record or an existing record
    • if an existing record then compare the required fields to determine whether it’s an updated record
    • have a process to find the aged records in the target table and do a cleanup for ‘deletes’
  2. Data file has the full/complete data because no audit columns are present
    • The data is of higher volume
      • have a backup of the previously received file
      • perform a comparison of the current file and the prior file; create a ‘change file’ by determining the inserts, updates and deletes. Ensure both the ‘current’ and ‘prior’ files are sorted by key fields (a staged-table comparison is sketched after this list)
      • have a process that reads the ‘change file’ and loads the data into the target table
      • based on the ‘change file’ volume, we could decide whether to do a ‘truncate & load’
    • The data is of lower volume
      • do a lookup on the target table and determine whether it’s a new record or an existing record
      • if an existing record then compare the required fields to determine whether it’s an updated record
      • have a process to find the aged records in the target table and do a clean up or delete
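When both the current and the prior extracts are staged in tables, one simple way to derive the ‘change file’ is a set comparison; the table and key names below are made up:

-- new or changed rows: present in the current extract but not identical in the prior one
SELECT * FROM STG_CURRENT
MINUS
SELECT * FROM STG_PRIOR;

-- deleted rows: keys present in the prior extract but missing from the current one
SELECT CUST_ID FROM STG_PRIOR
MINUS
SELECT CUST_ID FROM STG_CURRENT;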

You are welcome to share your thoughts/suggestions to enrich any “Infa Tip”.