Export
Export data from the system
Suppose you need to export data from the system to an Excel file, for example one like this:
To do this, we use the PIMInput component, which selects objects by a given criterion and exports them from the system.
Let's create the following job:
To retrieve the columns we want to see in the Excel file (in our case, the product data), we add them to the PIMInput schema:
To connect to a specific server, we fill in the necessary values in the PIMOutput context:
- server - the URL of your server (each company is given a different one);
- user - the user name the job connects with;
- password - the password of that user.
Set the following options for PIMInput:
The url, user and password values are taken from the context.
entity - the kind of data to read from the system. The options are:
- Item - object;
- Item Relation - object dependencies;
- Type - type;
- Attribute - attribute;
- Attribute Group - attribute group;
- Relation - dependency;
- User - user;
- Role - role;
- List of values - list of values.
In our job we use Item, since we are exporting objects.
where - the condition that selects the data to export. The query format is described in the "Query Language" section.
To export all products whose parentIdentifier is man_t_shirt, we write the following:
Note: in the "Query Language" section the searched field is written in quotation marks (""), but in Talend the quotation marks are not needed!
order - the order in which records are exported. For example, we can sort by the id attribute in ascending order:
page size - how many records to fetch per server request. This affects export performance: when you export a lot of data, a small page size increases the number of requests and therefore the run time.
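To get a feeling for the effect of page size, here is a minimal sketch that only does the arithmetic. It assumes that each page costs exactly one request; the class name, the method name and the record count are hypothetical, and the real component may issue additional requests.

```java
// Sketch only: estimates how many server requests an export needs,
// assuming one request per page (an assumption, not the component's contract).
class PageSizeEstimate {

    // Ceiling division: every started page costs one request.
    static long requestCount(long totalRecords, long pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        if (totalRecords < 0) {
            throw new IllegalArgumentException("totalRecords must not be negative");
        }
        return (totalRecords + pageSize - 1) / pageSize;
    }

    public static void main(String[] args) {
        long total = 50_000; // hypothetical number of records to export
        for (long pageSize : new long[] {10, 100, 1000}) {
            System.out.println("page size " + pageSize + " -> "
                    + requestCount(total, pageSize) + " requests");
        }
    }
}
```

For 50,000 records this gives 5000 requests with a page size of 10, but only 50 requests with a page size of 1000.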
To track errors, you can select the Show debug output option. If this option is selected, the system outputs additional debugging information at run time.
To output the data in the form we need, we first define the columns in the tJavaRow schema:
Then we write the code in tJavaRow itself:
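The exact code depends on your schema. As a rough sketch, assume the input schema has the columns identifier and name_en and the output schema has identifier and name (all of these are assumptions for illustration). In Talend, tJavaRow exposes the incoming and outgoing rows as input_row and output_row. The InputRow and OutputRow classes below only stand in for the row structures that Talend generates, so that the example runs on its own; in a real job only the body of transform would go into tJavaRow.

```java
// Sketch only: a stand-alone imitation of what a tJavaRow body might do.
class TJavaRowSketch {

    // Stand-ins for the row structures Talend generates from the schemas.
    // The column names are hypothetical.
    static class InputRow {
        String identifier;
        String name_en;
    }

    static class OutputRow {
        String identifier;
        String name;
    }

    // The statements inside this method are the part that would be
    // written in the tJavaRow code area.
    static void transform(InputRow input_row, OutputRow output_row) {
        output_row.identifier = input_row.identifier;
        // Replace a missing name with an empty string so that
        // downstream components do not receive null.
        output_row.name = input_row.name_en == null ? "" : input_row.name_en.trim();
    }

    public static void main(String[] args) {
        InputRow in = new InputRow();
        in.identifier = "man_t_shirt_1"; // hypothetical product identifier
        in.name_en = " Blue T-shirt ";
        OutputRow out = new OutputRow();
        transform(in, out);
        System.out.println(out.identifier + ";" + out.name);
    }
}
```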
Running the job produces an Excel file with the data we specified in PIMInput and tJavaRow:
Finding files not related to products
Suppose we have already uploaded pictures to the system, but for some reason some of them are not associated with products.
To find all the pictures that are not related to products, we create the following job:
First we take the PIMInput component, which selects objects by a given criterion and exports them from the system.
To get the files we need, we enter two columns into the PIMInput schema: identifier and name_en.
To connect to a specific server, we fill in the necessary values in the PIMOutput context:
- server - the URL of your server (each company is given a different one);
- user - the user name the job connects with;
- password - the password of that user.
Set the following options for PIMInput:
The url, user and password values are taken from the context.
entity - the kind of data to read from the system. The options are:
- Item - object;
- Item Relation - object dependencies;
- Type - type;
- Attribute - attribute;
- Attribute Group - attribute group;
- Relation - dependency;
- User - user;
- Role - role;
- List of values - list of values.
In our job we use Item, since we are exporting objects.
where - the condition that selects the data to export. The query format is described in the "Query Language" section.
To find the image files, we use a condition on typeIdentifier with the value image:
Note: in the "Query Language" section the searched field is written in quotation marks (""), but in Talend the quotation marks are not needed!
order - the order in which records are exported.
page size - how many records to fetch per server request. This affects export performance: when you export a lot of data, a small page size increases the number of requests and therefore the run time.
To track errors, you can select the Show debug output option. If this option is selected, the system outputs additional debugging information at run time.
The tJavaRow output schema contains the identifier and the picture identifier:
Then we write the code in tJavaRow itself:
To find the link between a product and an image, we use the PIMRowInput component.
The PIMRowInput component is the same as PIMInput, except that it can accept incoming columns as parameters.
In PIMRowInput we set the entity to Item Relation.
In where we write the following condition:
To track how many pictures have no link to a product, we add two outputs after PIMRowInput: FLOW for the rows where a link was found and Reject for the rows where it was not:
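The FLOW and Reject split can be pictured as the following check, sketched in plain Java outside of Talend. The class, the method and the identifiers are hypothetical; the set of linked identifiers stands in for what the Item Relation lookup returns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch only: imitates routing rows to FLOW (link found)
// or Reject (link not found).
class LinkCheckSketch {

    static class Result {
        final List<String> found = new ArrayList<>();    // would go to FLOW
        final List<String> rejected = new ArrayList<>(); // would go to Reject
    }

    // imageIds: identifiers of all pictures returned by the first query.
    // linkedImageIds: identifiers that take part in at least one relation.
    static Result split(List<String> imageIds, Set<String> linkedImageIds) {
        Result result = new Result();
        for (String id : imageIds) {
            if (linkedImageIds.contains(id)) {
                result.found.add(id);
            } else {
                result.rejected.add(id);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> images = Arrays.asList("img_1", "img_2", "img_3");
        Set<String> linked = new HashSet<>(Arrays.asList("img_2"));
        Result result = split(images, linked);
        System.out.println("found: " + result.found.size()
                + ", not found: " + result.rejected.size());
    }
}
```

The size of the rejected list is the number of pictures that still need a link.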
Once all files without links are found, you can extend the job after tLogRow to create the missing links with PIMOutput.
Export files from the system
If you need to export files from the system, create the following job:
To get the data from the system, we start with PIMInput, which selects objects by a given criterion and exports them from the system.
To connect to a specific server, we fill in the necessary values in the PIMOutput context:
- server - the URL of your server (each company is given a different one);
- user - the user name the job connects with;
- password - the password of that user.
Set the following options for PIMInput:
The url, user and password values are taken from the context.
entity - the kind of data to read from the system. The options are:
- Item - object;
- Item Relation - object dependencies;
- Type - type;
- Attribute - attribute;
- Attribute Group - attribute group;
- Relation - dependency;
- User - user;
- Role - role;
- List of values - list of values.
In our job we use Item, since we are exporting objects.
where - the condition that selects the data to export. The query format is described in the "Query Language" section.
To export only the objects that have an image uploaded, we write the following in where:
Note: in the "Query Language" section the searched field is written in quotation marks (""), but in Talend the quotation marks are not needed!
order - the order in which records are exported.
page size - how many records to fetch per server request. This affects export performance: when you export a lot of data, a small page size increases the number of requests and therefore the run time.
To track errors, you can select the Show debug output option. If this option is selected, the system outputs additional debugging information at run time.
In tJavaRow we prepare the data. We need the file id and the path on the file system where the file will be written.
We can build the path from a context value plus the original file name.
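Building the path can be sketched as follows. The directory value would come from a context variable in a real job. Here it is passed in as a plain string, and the names outputDir, originalFileName and targetPath are assumptions made for this example.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Sketch only: combines an output directory with the original file name.
class DownloadPathSketch {

    static String targetPath(String outputDir, String originalFileName) {
        // Keep only the last path segment, in case the stored name
        // contains directories.
        Path name = Paths.get(originalFileName).getFileName();
        if (name == null) {
            throw new IllegalArgumentException("file name is empty: " + originalFileName);
        }
        // resolve() inserts the platform's separator for us.
        return Paths.get(outputDir).resolve(name).toString();
    }

    public static void main(String[] args) {
        // "exports" is a hypothetical directory taken from the context.
        System.out.println(targetPath("exports", "photo.jpg"));
    }
}
```

Using Path.resolve instead of string concatenation avoids missing or doubled separators when the directory value ends with a slash.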
Now all we need to do to export the files is add PIMAssetDownload, the component that downloads files from the system.
Running the job exports the files from the system to the path you specified: