Over the past 5+ years I have built an extensive aircraft tracking/decoding website around Node-RED.
As time goes on, I am spending more and more time writing custom function block code to decode each type of string / text message.
It's got to the point where most of my time is spent wrangling substring and indexOf calls to extract each part of each message.
In short, it seems to me that this approach is unsustainable as each aircraft and airline has subtle differences in message structure.
It was suggested to me over the weekend that I should look at machine learning to model the information and have it extract the data I want out of the text strings.
Reviewing the forums and flows library, there seem to be a few nodes that might work, but before I go down the rabbit hole, I was wondering if it's even the right approach.
Here is a small example of the raw strings I am working with.
00:16:0808-11-20AES:AE022AGES:022.60032S!20FINI/ID60032S,BLUE01,ZPR008A21308/MR1,2/AFLPLA,KPSM/TD071030,10325053
01:30:0808-11-20AES:AE20C7GES:442.77186A!H1G-#MDINI/ID77186A,RCH356,AAM18131E306/MR0,0/AFLPLA,KPSM/TD070705,113038C4
19:37:0708-11-20AES:AE1472GES:822.77180A!H1N-#MDINI/ID77180A,RCH871,AJRF3368F313/MR1,0/AFFJDG,OAIX/TD081130,1059997B
20:49:0908-11-20AES:AE123AGES:822.44128A!H1P-#MDINI/ID44128A,RCH836,AJZA3362C312/MR0,0/AFFJDG,OKAS/TD081245,1245C15B
04:16:0508-11-20AES:AE0580GES:D02.70035B!33F/AMCTACC.INI01081215INITIALIZERCH551MC0035ABR02Y5XD313KDOVLERT081400/
22:33:0708-11-20AES:AE10BFGES:822.10196A!H1D-#MDINI/ID10196A,RCH873,JJRF3361F312/MR0,0/AFRODN,FJDG/TD071355,135525BA
22:47:0908-11-20AES:AE117EGES:822.21112A!H1C-#MDINI/ID21112A,RCH877,JJZA3363C312/MR0,0/AFFJDG,OKAS/TD081345,1345F5F3
07:32:0808-11-20AES:AE0243GES:022.80047S!20G01/Y/MC/0047/08/KIAB/KSVN/1701/0000/SNAP85//6A9E
It's hard to highlight the data I want extracted in the forums, but in short:
Get 6 characters after 'AES:'
Get 6 characters after the first dot.
(They are the super easy ones).
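As a minimal sketch of those two easy ones, each can be written as a single regular expression instead of indexOf plus substring. The function name extractIds and the property names aes and afterDot are made up for illustration, and it assumes the timestamp at the start never contains a dot.

```javascript
// Pull the two fixed-position fields out of a raw message.
// A field is null when its marker is missing from the string.
function extractIds(raw) {
  const aes = /AES:(.{6})/.exec(raw); // 6 characters after "AES:"
  const dot = /\.(.{6})/.exec(raw);   // 6 characters after the first dot
  return {
    aes: aes ? aes[1] : null,
    afterDot: dot ? dot[1] : null,
  };
}

const sample =
  "00:16:0808-11-20AES:AE022AGES:022.60032S!20FINI/ID60032S,BLUE01," +
  "ZPR008A21308/MR1,2/AFLPLA,KPSM/TD071030,10325053";
console.log(extractIds(sample)); // { aes: 'AE022A', afterDot: '60032S' }
```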
Get the callsign (it's in different places), but as an example, BLUE01, RCH356, RCH871, RCH551 etc. (Note, it's not always 6 characters, it can often be 7).
Then the really tricky stuff: the airport codes, which are usually toward the end of the message.
E.g. in the first message, LPLA and KPSM.
The fifth message is a bit tricky, but I need KDOV and LERT.
Lastly, I need the date and time. In the first message it's at the end: '07' and '1030'.
In the fifth message I need '08' and '1400'.
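To make the target fields concrete for the comma and slash delimited messages (like the first one), here is a rough sketch using one regular expression with named groups. The layout in the comment is only inferred from the samples above, not from any specification, and decodeIniId, airportA and airportB are illustrative names (I am not claiming which airport is the departure and which the destination).

```javascript
// Assumed layout, inferred from the sample messages only:
//   /ID<id>,<callsign>,<other>/MR<..>/AF<airport>,<airport>/TD<day><time>,...
const INI_ID = new RegExp(
  "/ID(?<id>[^,/]+),(?<callsign>[^,/]+),[^/]*" + // id, callsign, skipped field
  "/MR[^/]*" +                                   // skipped MR field
  "/AF(?<airportA>[A-Z]{4}),(?<airportB>[A-Z]{4})" +
  "/TD(?<day>\\d{2})(?<time>\\d{4})"
);

// Returns the named fields as a plain object, or null if the layout differs.
function decodeIniId(raw) {
  const m = INI_ID.exec(raw);
  return m ? { ...m.groups } : null;
}

const first =
  "00:16:0808-11-20AES:AE022AGES:022.60032S!20FINI/ID60032S,BLUE01," +
  "ZPR008A21308/MR1,2/AFLPLA,KPSM/TD071030,10325053";
const d = decodeIniId(first);
console.log(d.callsign, d.airportA, d.airportB, d.day, d.time);
// BLUE01 LPLA KPSM 07 1030
```

Because the callsign is captured as "everything up to the next comma", it does not matter whether it is 6 or 7 characters long.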
As I said, I have been writing a LOT of JavaScript string code in function blocks. Upstream switch blocks and more function blocks test each message type and structure and direct each message format to the correct 'decoder'.
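For comparison, the same "test the format, then route to a decoder" idea can be sketched as a table of patterns inside one function block. This is only an illustration, not what my flow currently does: the format names, the FORMATS table and the decode function are hypothetical, both patterns are guessed from the sample messages, and the second one assumes a callsign is letters followed by digits.

```javascript
// Each entry pairs a format name with one pattern; the first match wins.
const FORMATS = [
  {
    name: "slash-delimited", // e.g. .../AFLPLA,KPSM/TD071030,...
    pattern: /\/AF(?<airportA>[A-Z]{4}),(?<airportB>[A-Z]{4})\/TD(?<day>\d{2})(?<time>\d{4})/,
  },
  {
    name: "initialize", // e.g. ...INITIALIZERCH551...KDOVLERT081400/
    pattern: /INITIALIZE(?<callsign>[A-Z]+\d+).*?(?<airportA>[A-Z]{4})(?<airportB>[A-Z]{4})(?<day>\d{2})(?<time>\d{4})\/$/,
  },
];

// Try each known format in order and return its named fields.
function decode(raw) {
  const text = raw.trim();
  for (const { name, pattern } of FORMATS) {
    const m = pattern.exec(text);
    if (m) return { format: name, ...m.groups };
  }
  return null; // unknown format: worth logging so a pattern can be added
}

const fifth =
  "04:16:0508-11-20AES:AE0580GES:D02.70035B!33F/AMCTACC.INI01081215" +
  "INITIALIZERCH551MC0035ABR02Y5XD313KDOVLERT081400/";
const f = decode(fifth);
console.log(f.format, f.callsign, f.airportA, f.airportB, f.day, f.time);
// initialize RCH551 KDOV LERT 08 1400
```

In a function block this would be called as msg.payload = decode(msg.payload); followed by return msg;, and supporting a new message sub type means adding one entry to the table rather than another switch output and decoder block.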
I have hundreds and hundreds of raw and decoded messages, so I feel I could 'feed' a machine learning system pretty well and teach it what I want.
Note that the example messages are a very small subset of the messages. I have about 5-7 main message sources, each with a few dozen message sub types.
I would really like to stay in the Node-RED ecosystem, but can branch out, decode, and branch back in if need be.
Thanks for your thoughts.