Comparing Large Data Set

interscope101 · 15 November 2022 03:03

Got a challenge. I need to identify any cryptocurrency whose price changes by more than 5% within a certain timeframe (5 min) in a large set (around 2400).

I can do an API call to get the full list of currency data.

{"symbol":"TPT/USDT","high":0.0093782,"low":0.0069001,"bid":0.008059,"ask":0.0080912,"vwap":0.007671787591489715,"close":0.0080734,"last":0.0080734,"percentage":10.95,"baseVolume":11113331.365821,"quoteVolume":85259.117672419,"info":{"currency_pair":"TPT_USDT","last":"0.0080734","lowest_ask":"0.0080912","highest_bid":"0.008059","change_percentage":"10.95","change_utc0":"0.73","change_utc8":"4.83","base_volume":"11113331.365821","quote_volume":"85259.117672419","high_24h":"0.0093782","low_24h":"0.0069001"}}

I can filter it down a bit by selecting only those with a USDT base.
Five minutes later I do another call and need to compare the change in price (payload.payload["TPT/USDT"].last) for all currencies, then send every one that changed by more than 5% to a table.

THANKS IN ADVANCE FOR ANY HELP!!

bakman2 · 15 November 2022 06:44

I can do an API call to get the full list of currency data.

Is that the only thing that the API can deliver ?

interscope101 · 15 November 2022 06:46

What do you mean only thing

bakman2 · 15 November 2022 06:47

I can filter it down a bit by selecting only those with a USDT base.

Is this not something you can request from the API directly ?

hominidae · 15 November 2022 06:58

So, where is the bottleneck on your side then?
Is it the query-response time of the two individual API calls, the size of the data, or the time the JavaScript function takes to perform the analysis? A 5 min window seems like a lot of headroom for this to me.

Maybe there is an option to store the data from each API call in a time-series database, then do the diff with a query to that database? A native DB processor is more powerful than a function running in JavaScript.
I'd have a look at InfluxDB: store each price value along with a proper tag to identify the currency.

interscope101 · 15 November 2022 07:03

I was hoping I could but didn't find that option.

interscope101 · 15 November 2022 07:06

The bottleneck is my limited scope of expertise - everything you're describing is outside of it. Are you suggesting that 5 min is not sufficient to do the analysis?

There is a python wrapper for this API, so I suppose another approach is to do the call and analysis on my server in python and bring the results into node-red - however that introduces its own complexity.

bakman2 · 15 November 2022 07:09

To be frank; 2400 elements is nothing for comparing arrays, don't need databases.

You can use context to store the "previous" API results: do a forEach loop on the incoming payload, perform the calculation, return the output, and store the payload as "previous".
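
A minimal sketch of that approach, assuming the payload is an object keyed by symbol with a numeric `last` field, as in the sample record in the first post. The helper name `findMovers`, the output shape and the 5% default are illustrative only. It uses `Math.abs` so that drops are reported as well as gains; remove it if only gains matter.

```javascript
// Hypothetical helper: compare two snapshots keyed by symbol and return
// every symbol whose last price moved by more than thresholdPct percent.
function findMovers(previous, current, thresholdPct = 5) {
    const movers = [];
    for (const [symbol, ticker] of Object.entries(current)) {
        const before = previous[symbol];
        // Skip symbols that are new or have no usable previous price
        if (!before || !before.last) continue;
        const changePct = ((ticker.last - before.last) / before.last) * 100;
        if (Math.abs(changePct) > thresholdPct) {
            movers.push({ symbol, changePct: Number(changePct.toFixed(2)) });
        }
    }
    return movers;
}

// Assumed wiring inside a Node-RED function node (shown as comments):
// const previous = context.get('previous') || {};
// const current = msg.payload.payload;
// context.set('previous', current);
// msg.payload = findMovers(previous, current);
// return msg;
```

On the first message there is no previous snapshot, so the result is simply an empty array.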

interscope101 · 15 November 2022 07:14

I always hit some wall when dealing with arrays. Could you possibly share a script that I could test? I basically just need to grab two variables: the symbol, payload.payload["TPT/USDT"].symbol (only the ones with a USDT base), and the price, payload.payload["TPT/USDT"].last.

bakman2 · 15 November 2022 07:19

I don't know which API you are using, but if you use Finnhub, they have all the TA information directly available in the API, and it is free. I am using the CoinGecko API myself.

hominidae · 15 November 2022 07:32

Ah, OK. No, I actually thought that you were thinking a 5 min timeframe would not suffice. 5 min should be plenty for that.

interscope101 · 15 November 2022 07:33

I am using ccxt, calling either the gate.io or Binance exchange. I just took a look at Finnhub. It's the same data: they all offer the 24-hour change, so it's going to be the same task to get my 5 min interval regardless of which API I use.

hominidae · 15 November 2022 07:35

Well, I agree technically. But it could be worth considering for testing, especially when API calls or call rates are limited, as well as for backup/restore.

interscope101 · 15 November 2022 07:35

5 min is the price change period I want to test (not related to processing time).

bakman2 · 15 November 2022 08:13

Uhm, Finnhub has all technical analysis available in multiple resolutions (1, 5, 15, 30 min, etc.); check the section TA > aggregate indicators, for example. It also has pattern recognition available for those same resolutions.

interscope101 · 15 November 2022 08:16

I've never seen "price change" as a TA indicator

bakman2 · 15 November 2022 08:32

Correct, because it is not an indicator, but a signal (buy/sell/neutral)

interscope101 · 19 November 2022 08:32

Still hoping someone can assist. Here is a crude example (using one object) of what I am trying to accomplish. I have a msg-resend node calling the API every 3 minutes; I calculate the price change (%), and if it's greater than the threshold I send the record to a table (along with the extracted symbol name). I imagine this can be done much more efficiently with a JSONata expression in a change node.

I cannot figure out how to do this with an array of multiple objects. The array should first be filtered down (only records above a certain price, above a certain volume, and with a USDT base). Records that meet the threshold should be sent sequentially with some delay (say 1 sec apart). Thanks in advance.

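// Single-coin example: percentage change of GALA/USDT since the previous call
const ticker = msg.payload.payload["GALA/USDT"];
const currency = ticker.symbol.replace("/USDT", "");
const latest = ticker.last;
const prev = flow.get('prev');
flow.set('prev', latest);

// First run: there is no previous price yet, so there is nothing to compare
// (dividing by a missing or zero price would give Infinity or NaN)
if (!prev) {
    return null;
}

const gain = (((latest - prev) / prev) * 100).toFixed(2);
const msgAlert = { payload: { currency: currency, gain: gain } };
return [msgAlert];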

I have a choice of two different array format options (not sure it makes a difference)
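
The filtering step described above might be sketched as follows. This assumes each record carries `symbol`, `last` and `quoteVolume` fields as in the sample in the first post; the helper name `selectTickers` and its option names are hypothetical. It accepts either format, an array of records or an object keyed by symbol, so the choice of format should not matter much.

```javascript
// Hypothetical helper: accept either an array of tickers or an object
// keyed by symbol, and keep only USDT pairs above a minimum price and volume.
function selectTickers(payload, { minPrice = 0, minQuoteVolume = 0 } = {}) {
    // Object.values turns {"TPT/USDT": {...}} into [{...}]
    const tickers = Array.isArray(payload) ? payload : Object.values(payload);
    return tickers
        .filter(t => t.symbol.endsWith('/USDT'))
        .filter(t => t.last > minPrice && t.quoteVolume > minQuoteVolume)
        .map(t => ({
            symbol: t.symbol,
            currency: t.symbol.replace('/USDT', ''),
            last: t.last
        }));
}

// Assumed usage inside a function node: emit one message per record.
// const records = selectTickers(msg.payload.payload, { minPrice: 0.001, minQuoteVolume: 1000 });
// return [records.map(r => ({ payload: r }))];
```

Returning an array of messages for a single output sends them one after another; a delay node in rate-limit mode (for example 1 msg per second) placed after the function node would then space them out.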

knoepsche · 19 November 2022 14:43

Processing an array of 3000 items is too much? I don't think so.

According to your first post you have 5 minutes between each message. This should be more than enough to process the whole array.

I might be wrong, but the danger is that processing one array of about 3000 items in a function node in one go blocks everything else if it takes too long. You could measure the duration in milliseconds like this:

msg.timeStart = new Date().valueOf();
...your code...
msg.timeEnd = new Date().valueOf();
return msg

I ran an example on my system, and there filtering an array of 3000 items took about 1 millisecond:

[
    {
        "id": "7792bb09a854baa0",
        "type": "inject",
        "z": "839d658c3816e9e7",
        "name": "Timestamp start",
        "props": [
            {
                "p": "topic",
                "vt": "str"
            },
            {
                "p": "timeStart",
                "v": "",
                "vt": "date"
            },
            {
                "p": "timeEnd",
                "v": "",
                "vt": "date"
            }
        ],
        "repeat": "",
        "crontab": "",
        "once": false,
        "onceDelay": 0.1,
        "topic": "Measure runtime",
        "x": 180,
        "y": 1440,
        "wires": [
            [
                "d846c10eef3bf71f"
            ]
        ]
    },
    {
        "id": "d846c10eef3bf71f",
        "type": "function",
        "z": "839d658c3816e9e7",
        "name": "Array with 3000 items",
        "func": "msg.topic = \"Measurement 1.1\"\nmsg.timeStart = new Date().valueOf();\nmsg.payload = new Array(3000).fill().map((el, idx) => { return {id: idx.toString(), value:Math.random()}});\nmsg.timeEnd = new Date().valueOf();\nreturn msg;",
        "outputs": 1,
        "noerr": 0,
        "initialize": "",
        "finalize": "",
        "libs": [],
        "x": 460,
        "y": 1440,
        "wires": [
            [
                "a05badbc4295ae2c",
                "d289f19cea9dec5f"
            ]
        ]
    },
    {
        "id": "a05badbc4295ae2c",
        "type": "debug",
        "z": "839d658c3816e9e7",
        "name": "Array Generated",
        "active": false,
        "tosidebar": true,
        "console": false,
        "tostatus": false,
        "complete": "true",
        "targetType": "full",
        "statusVal": "",
        "statusType": "auto",
        "x": 470,
        "y": 1400,
        "wires": []
    },
    {
        "id": "d289f19cea9dec5f",
        "type": "function",
        "z": "839d658c3816e9e7",
        "name": "Filter Array",
        "func": "msg.topic = \"Measurement 1.1\"\nmsg.timeStart = new Date().valueOf();\nlet myArray = msg.payload;\nconst lastArray = context.get('lastArray');\nif (lastArray !== undefined) {\n    // keep items whose value rose by more than 5% since the previous message\n    myArray = myArray.filter((el, idx) => el.value > lastArray[idx].value * 1.05);\n}\ncontext.set('lastArray', msg.payload);\nmsg.payload = myArray;\nmsg.timeEnd = new Date().valueOf();\nreturn msg;",
        "outputs": 1,
        "noerr": 0,
        "initialize": "",
        "finalize": "",
        "libs": [],
        "x": 810,
        "y": 1440,
        "wires": [
            [
                "6b9e12e16a0e114d"
            ]
        ]
    },
    {
        "id": "6b9e12e16a0e114d",
        "type": "debug",
        "z": "839d658c3816e9e7",
        "name": "Array Filtered",
        "active": true,
        "tosidebar": true,
        "console": false,
        "tostatus": false,
        "complete": "true",
        "targetType": "full",
        "statusVal": "",
        "statusType": "auto",
        "x": 820,
        "y": 1400,
        "wires": []
    }
]

Perhaps you could do the same. As a workaround, you could split the computation into chunks of 100 items, which gives Node-RED the chance to run other nodes in between. An example:

[
    {
        "id": "5bf81ce259910494",
        "type": "inject",
        "z": "839d658c3816e9e7",
        "name": "Timestamp start",
        "props": [
            {
                "p": "topic",
                "vt": "str"
            },
            {
                "p": "timeStart",
                "v": "",
                "vt": "date"
            },
            {
                "p": "timeEnd",
                "v": "",
                "vt": "date"
            }
        ],
        "repeat": "",
        "crontab": "",
        "once": false,
        "onceDelay": 0.1,
        "topic": "Measure runtime",
        "x": 160,
        "y": 1780,
        "wires": [
            [
                "6e9696f91d3fc4c1"
            ]
        ]
    },
    {
        "id": "6e9696f91d3fc4c1",
        "type": "function",
        "z": "839d658c3816e9e7",
        "name": "Array with 10000 items",
        "func": "msg.topic = \"Measurement 1.1\"\nmsg.timeStart = new Date().valueOf();\nmsg.payload = new Array(10000).fill().map((el, idx) => { return {id: idx.toString(), value:Math.random()}});\nmsg.timeEnd = new Date().valueOf();\nreturn msg;",
        "outputs": 1,
        "noerr": 0,
        "initialize": "",
        "finalize": "",
        "libs": [],
        "x": 450,
        "y": 1780,
        "wires": [
            [
                "4bb669bcfe173505",
                "38a8075ae5618b92"
            ]
        ]
    },
    {
        "id": "4bb669bcfe173505",
        "type": "function",
        "z": "839d658c3816e9e7",
        "name": "Measurement 1.2 (split into chunks of 100)",
        "func": "msg.timeStart = new Date().valueOf();\nconst arrPart = []\nlet startIdx;\nlet endIdx;\n\nif(msg.trigger === undefined){\n    context.set('lastArray', context.get('array'));\n    context.set('array', msg.payload);\n    msg.parts = {};\n    msg.parts.type = \"array\";\n    msg.parts.count = Math.ceil(msg.payload.length/100);\n    msg.parts.len = 100;\n    msg.parts.index = 0;\n    msg.parts.id = msg._msgid;\n    msg.arraylength = msg.payload.length;\n    startIdx = msg.parts.index;\n}else{\n    startIdx = msg.parts.index * 100\n}\n\n//Split into chunks of 100\nendIdx = startIdx + 100;\nif(msg.arraylength < endIdx){\n    endIdx = msg.arraylength;\n    msg.parts.len = endIdx - startIdx;\n}\n\nfor(let i=startIdx; i<endIdx; i++){\n    //************************************************************\n    //THE WORK TO BE DONE FOR EACH ITEM GOES HERE\n    if(msg.parts.index==0){\n        msg.computeArrayStart = new Date().valueOf();\n    }else{\n        msg.computeArrayEnd = new Date().valueOf(); \n    }\n    if(context.get('lastArray')===undefined){\n        arrPart.push(context.get(\"array\")[i]);\n    }else{\n        if (context.get('array')[i].value > context.get('lastArray')[i].value * 1.05) {\n            arrPart.push(context.get(\"array\")[i]);\n        }\n    }\n    //************************************************************\n}\nmsg.parts.len = arrPart.length;\n\nmsg.payload = arrPart;\nmsg._msgid = RED.util.generateId();\nmsg.timeEnd = new Date().valueOf();\nreturn msg;",
        "outputs": 1,
        "noerr": 0,
        "initialize": "",
        "finalize": "",
        "libs": [],
        "x": 890,
        "y": 1780,
        "wires": [
            [
                "375d5a507181218e",
                "d67ba09735431465"
            ]
        ]
    },
    {
        "id": "375d5a507181218e",
        "type": "function",
        "z": "839d658c3816e9e7",
        "name": "Msg Control (While Do)",
        "func": "msg.timeStart = new Date().valueOf();\nif(msg.parts.index < msg.parts.count-1){\n    let triggerMsg = RED.util.cloneMessage(msg);\n    triggerMsg.trigger = true;\n    triggerMsg.parts.index++;\n    return[msg, triggerMsg];\n}\nmsg.timeEnd = new Date().valueOf();\nreturn [msg, null];\n\n",
        "outputs": 2,
        "noerr": 0,
        "initialize": "",
        "finalize": "",
        "libs": [],
        "x": 910,
        "y": 1700,
        "wires": [
            [
                "d3a8aa33af2e8fe3",
                "84db9fc580780309"
            ],
            [
                "4bb669bcfe173505"
            ]
        ]
    },
    {
        "id": "d3a8aa33af2e8fe3",
        "type": "debug",
        "z": "839d658c3816e9e7",
        "name": "Chunk Triggered",
        "active": false,
        "tosidebar": true,
        "console": false,
        "tostatus": false,
        "complete": "true",
        "targetType": "full",
        "statusVal": "",
        "statusType": "auto",
        "x": 930,
        "y": 1660,
        "wires": []
    },
    {
        "id": "38a8075ae5618b92",
        "type": "debug",
        "z": "839d658c3816e9e7",
        "name": "Array Generated",
        "active": false,
        "tosidebar": true,
        "console": false,
        "tostatus": false,
        "complete": "true",
        "targetType": "full",
        "statusVal": "",
        "statusType": "auto",
        "x": 450,
        "y": 1740,
        "wires": []
    },
    {
        "id": "d67ba09735431465",
        "type": "debug",
        "z": "839d658c3816e9e7",
        "name": "Chunk Splitted",
        "active": false,
        "tosidebar": true,
        "console": false,
        "tostatus": false,
        "complete": "true",
        "targetType": "full",
        "statusVal": "",
        "statusType": "auto",
        "x": 940,
        "y": 1840,
        "wires": []
    },
    {
        "id": "84db9fc580780309",
        "type": "join",
        "z": "839d658c3816e9e7",
        "name": "",
        "mode": "auto",
        "build": "object",
        "property": "payload",
        "propertyType": "msg",
        "key": "topic",
        "joiner": "\\n",
        "joinerType": "str",
        "accumulate": "false",
        "timeout": "",
        "count": "",
        "reduceRight": false,
        "x": 1230,
        "y": 1680,
        "wires": [
            [
                "55fb5e2b92f8dfcc"
            ]
        ]
    },
    {
        "id": "55fb5e2b92f8dfcc",
        "type": "debug",
        "z": "839d658c3816e9e7",
        "name": "Computed Array",
        "active": true,
        "tosidebar": true,
        "console": false,
        "tostatus": false,
        "complete": "true",
        "targetType": "full",
        "statusVal": "",
        "statusType": "auto",
        "x": 1440,
        "y": 1680,
        "wires": []
    }
]
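
The same chunking idea can also be sketched in plain Node.js, outside of any flow wiring. This is only an illustration of the technique, not what the flow above does internally: the helper name `processInChunks` is hypothetical, and it relies on Node's `setImmediate` to hand control back to the event loop between chunks (whether that function is exposed inside a function node is not assumed here).

```javascript
// Hypothetical helper: process items in chunks, yielding to the event
// loop between chunks so other work is not blocked for long.
function processInChunks(items, chunkSize, handleItem) {
    return new Promise((resolve) => {
        const results = [];
        let start = 0;

        function runChunk() {
            const end = Math.min(start + chunkSize, items.length);
            for (let i = start; i < end; i++) {
                const out = handleItem(items[i], i);
                // handleItem returns undefined for items to be dropped
                if (out !== undefined) results.push(out);
            }
            start = end;
            if (start < items.length) {
                setImmediate(runChunk); // let other queued events run first
            } else {
                resolve(results);
            }
        }

        runChunk();
    });
}

// Example: keep items whose value rose by more than 5% (data is made up)
const last = [{ value: 1.0 }, { value: 2.0 }, { value: 3.0 }];
const now = [{ value: 1.1 }, { value: 2.0 }, { value: 3.3 }];
processInChunks(now, 2, (el, i) => (el.value > last[i].value * 1.05 ? el : undefined))
    .then(risen => console.log(risen.length, 'items rose by more than 5%'));
```

For a few thousand items the chunking overhead is likely larger than the work itself, so this only pays off when the per-item work is expensive.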

Another thing you could do is use https://flows.nodered.org/node/node-red-contrib-mp-function. This node creates a worker thread that runs in parallel to the single-threaded Node-RED runtime.

The next option would be to open a worker thread in your browser app and not compute on the server side at all. Just hand the array over to your GUI and do the work there.

TotallyInformation · 19 November 2022 15:02

If you want a JavaScript solution, have a look at my Wiser node.js library (not the Node-RED Wiser node). It uses a comparator library that will compare two JavaScript objects of arbitrary complexity and output the differences. I use it to compute the differences between two calls to the Wiser smart heating API.
