OroPlatform Forums

Covering OroPlatform topics, including community updates and company announcements.

Forums Forums OroPlatform OroPlatform – Programming Questions Question about oro_tracking_event.user_identifier

This topic contains 10 replies, has 3 voices, and was last updated by Dima Soroka Dima Soroka 5 years, 4 months ago.

Starting from March 1, 2020 the forum has been switched to the read-only mode. Please head to StackOverflow for support.

  • Creator
    Topic
  • #33825

    Rodolfo
    Participant

    Hello friends!

    Is there any special reason for why you guys created the oro_tracking_event.user_identifier in this format? User-id + email + visitor-id all concatenated? Any plans to change it in separate fields?

    I’m asking because we’re working on some custom reports to get the customer behavior in every magento order. Unfortunately it’s getting slower to filter it using because we have millions of rows on this table;

    • What products did the customer visited before bought something based on visitor-id;
    • What is the first urfref from the visitor-id to define if he comes from Google, Facebook, etc;
    • Did the customer searched for anything before bought it;
    • Track all customer events based on user/email;
Viewing 10 replies - 1 through 10 (of 10 total)
  • Author
    Replies
  • #33826
    Dima Soroka
    Dima Soroka
    Keymaster

    Hi Rodolfo

    It was a first step into web tracking direction, based on this data we are defining proper relations to customer in the upcoming release. This is already available in dev master, you can check it out and let us know if this is covering your request.

    #33827

    Rodolfo
    Participant

    Hello Dima!

    Thank you for your answer.
    I noticed some new entities like ‘oro_tracking_visit’ and also tested the oro:cron:tracking:parse command.

    The user_identifier are still in the same format. If I understood well this new feature will help to provide custom reports using data like: Browser, SO, Desktop VS Mobile.

    If I understood well, to get the events from a specific user, I’ll need to keep my queries on the same way.. using LIKE ‘%%’ passing user-id or email which is very slow though.

    #33828
    Dima Soroka
    Dima Soroka
    Keymaster

    Hi Rodolfo

    No, parsing will update relation between event and customer so you’ll build the query based on it.

    Thanks

    #33829

    Yurii Muratov
    Participant

    Hi, @rbandeira.
    We have new entities TrackingVisit, TrackingVisitEvent and TrackingEventDictionary at the TrackingBundle.
    This tables are collects during oro:cron:tracking:parse command.

    The table TrackingVisit (oro_tracking_visit) has relations to the oro data. This was made by association. More info about associations you can find here https://github.com/orocrm/platform/blob/master/src/Oro/Bundle/EntityExtendBundle/Resources/doc/associations.md.
    For each record, you can get this data by getIdentifierTarget() method. This method described at Oro\Bundle\TrackingBundle\Model\ExtendTrackingVisit. This method can return objects of different types.

    Developers can add custom types of identifiers. For example, for magento integration, it will be magento customer.

    You can find some information about tracking in documentation https://github.com/orocrm/platform/blob/master/src/Oro/Bundle/TrackingBundle/README.md#trackingprocessor.

    How to build queries with oro_tracking_visit table.
    For example, you have magento integration and tracking website for this integration channel, and you want to select the list of magento customers that use Firefox browser at your magento website.
    In this case the code will be something like:

    #33830

    Rodolfo
    Participant

    Hi @yurio and @Soroka

    Thanks so much for your explanations. I’d like to share with you some thoughts that I noticed on my Oro’s instance.

    1) The first run of ‘parse command’ started 1 week ago and didn’t finish yet. My oro_tracking_event has ~14 millions rows. This command are consuming 100% of the Database Server CPU and the navigation on orocrm is very slow; It’d be nice to have a batch size argument on the command line.

    2) The (Tracking Websites/Events) grid is taking 2 minutes to search all events from a specific email. I believe this grid could be rebuild using the new structure since this is the first point of contact of normal users on tracking data. I was talking with @Soroka and I’ll try his idea creating a new column and saving Emails on it. Maybe also Id-User and Id-Visitor all spitted in different columns. I’ll try to rebuild the events grid getting rid of User column and adding the new fields as well.

    3) I don’t know yet if after the parse command finish it will get rid of duplicated data. Since my biggest table now is oro_tracking_events, after the parsing data finish I’ll have 4 tables with ~14M of rows each one and the “same data” resulting in almost 18gb of extra storage for me.

    Thank you,
    Rodolfo

    #33831

    Yurii Muratov
    Participant

    Hi, @rbandeira.

    Thanks for highlight this notices.

    1. Very strange. We test oro:cron:tracking:parse command with 7m of records. The first run of this command was taken about 24 hours.
    Can you provide command log? We should see this log to detect where the problem was.

    2. Yes. With new structure, this query should be faster.

    3. Yes, the oro_tracking_visit_event will contain ~ 14m of records. I think we can do some optimizations for oro_tracking_events table. I think we can delete this row data after oro:cron:tracking:parse command finish.

    #33832

    Rodolfo
    Participant

    Hi @yurio

    This is an example of the output of
    php app/console oro:cron:tracking:parse -e prod –process-isolation

    It’s not generating any error in prod.log but we can see the pattern of 300 rows parsed per minute. So.. if we do the math of what I still have to parse.. 9229600 / 300 = 30765 minutes.. that means 21 days still left.

    My DB is running on Amazon RDS db.m3.medium (1 CPU and 3.75 GB/RAM)

    #33833
    Dima Soroka
    Dima Soroka
    Keymaster

    Hi Rodolfo

    I think your RDS instance is not efficient enough for such Data Size, could you please try to upgrade it to db.m3.xlarge and see how this will change?

    Thanks
    Dima

    #33834

    Rodolfo
    Participant

    Hello Dima and Yurii

    I agree with you that upgrading the instance could help us on this parsing command. But the problem is after the parse finish I’m not sure if I can downgrade again and the cost will be double the actual price. In the normal CPU usage, our instance is running at 8% of CPU capacity, so I’m not sure if it worth upgrade just for one command.

    This is the query that the parsing command is running. It takes 55 seconds here:

    SELECT o0_.id, o0_.urlref AS urlref0, o0_.res AS res1, o0_.id AS id2, o0_.name AS name3, o0_.value AS value4, o0_.user_identifier AS user_identifier5, o0_.url AS url6, o0_.title AS title7, o0_.code AS code8, o0_.parsed AS parsed9, o0_.created_at AS created_at10, o0_.logged_at AS logged_at11, o0_.website_id AS website_id12 FROM oro_tracking_event o0_ WHERE o0_.parsed = 0 ORDER BY o0_.created_at ASC LIMIT 100

    This is the same query but I’m passing the ‘id’ range. It takes 100 milliseconds!

    SELECT o0_.id, o0_.urlref AS urlref0, o0_.res AS res1, o0_.id AS id2, o0_.name AS name3, o0_.value AS value4, o0_.user_identifier AS user_identifier5, o0_.url AS url6, o0_.title AS title7, o0_.code AS code8, o0_.parsed AS parsed9, o0_.created_at AS created_at10, o0_.logged_at AS logged_at11, o0_.website_id AS website_id12 FROM oro_tracking_event o0_ WHERE o0_.id > 7752218 and o0_.id < 7752317

    Other problems that we found:

    1) We don’t have anymore the full ‘Tracking Events’ Entity on reports. Some custom reports that we created using the TrackingEvents + Code as filter aren’t working anymore.

    2) The parsing data is ignoring the columns like: CODE, URL, TITLE. How can we generate reports using these fields? We are using “Campaing Code” from an external tool and not from OroCRM. (ExactTarget)

    Thank you!

    #33835
    Dima Soroka
    Dima Soroka
    Keymaster

    Thanks Rodolfo, we’ll release fixes soon

Viewing 10 replies - 1 through 10 (of 10 total)

The forum ‘OroPlatform – Programming Questions’ is closed to new topics and replies.

You will be redirected to [title]. Would you like to continue?

Yes No