Connector jobs

The connector uses jobs to extract data from Mirakl and push it to Hyperwallet. This section describes how jobs work and how to troubleshoot them.

Job types

The connector executes two types of jobs:

  • Extract jobs: these jobs extract data from Mirakl and then process the extracted items.

  • Retry jobs: these jobs extract data from an internal store of failed items and then process the extracted items.

Extract and retry jobs apply the same logic to process the extracted items; the only difference is the source of the items. The Retry jobs section below provides more information about the retry process.

Common aspects

Extract and retry jobs share most of their logic. This section describes the common aspects of these jobs.

Phases of a job

Jobs split their execution into different phases (a sketch in Java follows the list):

  • Extract phase: during this phase the connector retrieves the items to process from the data source (e.g., Mirakl invoices or previously failed invoices).

  • Prepare for item processing: during this phase the connector post-processes the data obtained during the extract phase, for example by retrieving from the source additional information required for item processing.

  • Item processing: the connector processes each of the retrieved items. This phase has the following steps:

    • Enrich item: the connector enriches the information of the items retrieved during the extraction phase, usually with information obtained during the preparation phase. For example, the connector enriches Mirakl invoices with information from the shop.

    • Validate item: the connector checks if the item is ready for processing.

    • Item processing: the connector processes the enriched and validated item, for example creating a payment in Hyperwallet from the invoice in Mirakl.
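
The following sketch illustrates these phases in Java. It is a simplified illustration under assumed names: the types and methods (BatchJobItem, AbstractBatchJob, extractItems, and so on) are hypothetical, not the connector's actual classes.

import java.util.List;

// Hypothetical types illustrating the phases of a job; the real
// connector classes are named differently.
interface BatchJobItem {
    String getItemId();
}

abstract class AbstractBatchJob<T extends BatchJobItem> {

    // Extract phase: retrieve the items to process from the data source
    // (e.g. Mirakl invoices or previously failed invoices).
    protected abstract List<T> extractItems();

    // Prepare phase: fetch additional information required to process
    // the extracted items (e.g. shop details for invoices).
    protected abstract void prepareForItemProcessing(List<T> items);

    // Item processing steps: enrich, validate, process.
    protected abstract T enrichItem(T item);
    protected abstract boolean validateItem(T item);
    protected abstract void processItem(T item);

    public final void execute() {
        List<T> items = extractItems();
        prepareForItemProcessing(items);
        for (T item : items) {
            T enriched = enrichItem(item);
            if (validateItem(enriched)) {
                processItem(enriched);
            }
        }
    }
}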

Logs of a job

The connector writes logs following a semi-structured format. The first part of each log line contains internal details from Java:

26-04-2023 12:43:12.548 [HyperwalletMiraklScheduler_Worker-1] INFO   com.paypal.jobsystem.quartzintegration.listener.JobExecutionInformationListener - <Log message>
  • the first part is the date and time of the log

  • the second part is the thread name

  • the third part is the log level

  • the fourth part is the name of the Java class that emitted the log

After that comes the log message, which has two parts:

  • A JSON object with the transaction context information. It is useful to correlate the log message with the type of work the connector is doing.

  • The log message itself.

The JSON has the following structure:

{
  "id": "ea859859-8d1a-41bb-abdd-fa9138ca4dc2",
  "type": "BatchJob",
  "subtype": "IndividualSellersExtractBatchJob",
  "itemType": "IndividualSeller",
  "itemId":"78771"
}

The following is an example of a log message while the connector is processing a job:

 {"id":"ea859859-8d1a-41bb-abdd-fa9138ca4dc2","subtype":"IndividualSellersExtractBatchJob","type":"BatchJob"} com.paypal.observability.batchjoblogging.listeners.BatchJobLoggingListener - Starting processing of job

The following is an example of a log message while the connector is processing an item inside a job:

26-04-2023 12:43:13.562 [HyperwalletMiraklScheduler_Worker-1] INFO  {"id":"ea859859-8d1a-41bb-abdd-fa9138ca4dc2","subtype":"IndividualSellersExtractBatchJob","itemType":"IndividualSeller","itemId":"78771","type":"BatchJob"} com.paypal.observability.batchjoblogging.listeners.BatchJobLoggingListener - Processing item of type IndividualSeller with id: 78771
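
To correlate log lines programmatically, the transaction context JSON can be pulled out of a line and parsed. The following is a minimal sketch using Jackson; it assumes the log format shown above and is not part of the connector.

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class LogContextParser {

    private static final ObjectMapper MAPPER = new ObjectMapper();

    // Extracts the transaction context JSON embedded in a log line.
    // The context object is flat, so the first '{'..'}' pair delimits it.
    static JsonNode parseContext(String logLine) throws Exception {
        int start = logLine.indexOf('{');
        int end = logLine.indexOf('}', start);
        if (start < 0 || end < 0) {
            return null; // the line carries no transaction context
        }
        return MAPPER.readTree(logLine.substring(start, end + 1));
    }

    public static void main(String[] args) throws Exception {
        String line = "26-04-2023 12:43:13.562 [HyperwalletMiraklScheduler_Worker-1] INFO  "
                + "{\"id\":\"ea859859-8d1a-41bb-abdd-fa9138ca4dc2\",\"itemType\":\"IndividualSeller\","
                + "\"itemId\":\"78771\",\"type\":\"BatchJob\"} "
                + "com.paypal.observability.batchjoblogging.listeners.BatchJobLoggingListener "
                + "- Processing item of type IndividualSeller with id: 78771";
        JsonNode context = parseContext(line);
        System.out.println(context.get("id").asText());     // the transaction id
        System.out.println(context.get("itemId").asText()); // 78771
    }
}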

Extract jobs

Extraction date calculation

The extract jobs make a request to the Mirakl API to retrieve the entities that have changed since a specific date.

When the connector executes an extract job automatically, it calculates the start time for retrieving changes: the time of the last successful execution of the job that returned Mirakl entities.

When an extract job is executed manually, the user must provide the initial time as part of the HTTP request.

The connector limits how many days in the past it looks when retrieving the changed entities. This limit is set with the environment variable PAYPAL_HYPERWALLET_JOB_EXTRACTION_MAXDAYS (default: 30).
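
Assuming that the limit clamps the extraction window (rather than rejecting the run), the effective start time is the later of the last successful run and now minus the configured maximum. A minimal sketch of that calculation, with hypothetical names:

import java.time.Duration;
import java.time.Instant;

public class ExtractionWindow {

    // Never look further back than maxDays days, mirroring
    // PAYPAL_HYPERWALLET_JOB_EXTRACTION_MAXDAYS (default 30).
    static Instant effectiveDelta(Instant lastSuccessfulRun, int maxDays) {
        Instant earliestAllowed = Instant.now().minus(Duration.ofDays(maxDays));
        return lastSuccessfulRun.isAfter(earliestAllowed) ? lastSuccessfulRun : earliestAllowed;
    }

    public static void main(String[] args) {
        // A job that last succeeded 45 days ago is clamped to 30 days back.
        Instant lastRun = Instant.now().minus(Duration.ofDays(45));
        System.out.println(effectiveDelta(lastRun, 30));
    }
}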

REST API

The existing jobs can be executed manually through their endpoints. Except for the notification retry job, which doesn’t receive any parameters, all endpoints support two optional parameters:

  • delta: when provided for an extract job, the job only processes entities that were created or updated after this date.

  • name: when provided, the job is given this name.

Param  Format
name   String
delta  yyyy-MM-dd'T'HH:mm:ss.SSSXXX

Endpoints:

HTTP Method  Path                               Job type
POST         /job/sellers-extract               Individual Sellers extract
POST         /job/professional-sellers-extract  Professional Sellers extract
POST         /job/bank-accounts-extract         Bank accounts extract
POST         /job/invoices-extract              Invoices extract

The following is an example of a valid execution request:

curl --location --request POST 'http://localhost:8080/job/bank-accounts-extract?delta=2020-11-22T11:52:00.000-00:00&name=bankAccountExtractJob'
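
The delta value in the example follows the yyyy-MM-dd'T'HH:mm:ss.SSSXXX pattern described above. A small illustrative sketch producing such a value in Java:

import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;

public class DeltaFormat {
    public static void main(String[] args) {
        // The same pattern the job endpoints accept for the delta parameter.
        DateTimeFormatter fmt = DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSSXXX");
        String delta = OffsetDateTime.now().format(fmt);
        System.out.println(delta); // e.g. 2023-04-26T12:43:12.548+02:00
    }
}

Note that a positive UTC offset contains a + character, which must be percent-encoded as %2B when passed in a query string.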

Retry jobs

The connector has specific jobs for retrying items that have failed during the execution of an extraction job.

When the processing of an item fails (for example, an individual seller), the connector stores that item's information in a database. Retry jobs periodically read the contents of that database and reprocess the failed items. Retry attempts for an item are spaced in time according to this expression: Item Last Failure Time + (30 minutes * number of attempts). The connector makes a maximum of 5 retries for each item.
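
The scheduling rule above can be expressed as a small helper. This is an illustrative sketch with hypothetical names, not the connector's implementation:

import java.time.Duration;
import java.time.Instant;
import java.util.Optional;

public class RetryPolicy {

    private static final Duration BASE_DELAY = Duration.ofMinutes(30);
    private static final int MAX_RETRIES = 5;

    // Next retry time = item last failure time + (30 minutes * number of attempts);
    // empty once the item has exhausted its 5 retries.
    static Optional<Instant> nextRetryTime(Instant lastFailureTime, int attempts) {
        if (attempts >= MAX_RETRIES) {
            return Optional.empty();
        }
        return Optional.of(lastFailureTime.plus(BASE_DELAY.multipliedBy(attempts)));
    }

    public static void main(String[] args) {
        // After the first failure (1 attempt so far) the item is retried 30 minutes later.
        System.out.println(nextRetryTime(Instant.now(), 1));
    }
}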

The connector executes retry jobs more frequently than standard jobs, since they need to check whether it’s time to reprocess an item according to the previous expression. The frequency of retry jobs is configurable with environment variables, but changing the default values isn’t recommended.