Using Node-RED for RSS / website monitors to replace Huginn

Not sure you are being fair here. The cheerio library is great at analysing and extracting html elements. The html node uses this, albeit in a somewhat limited way, hence the blog post I pointed to earlier in the thread.

The node-red-contrib-nbrowser node gives access to a full headless browser environment so even dynamic pages can be analysed and data extracted.

There are lots of options in JavaScript for sanitising HTML:

https://www.startpage.com/do/dsearch?query=npm+javascript+html+sanitize&cat=web&pl=opensearch&language=english

No, you're right, I should have worded it differently. Thanks for pointing those out, gives me something to play with in the future some day :slight_smile:

Well ... thank you!
You surely provided me with plenty of leads to start and toy around with.

I guess all the sites are this complex and frankly, I am in a little WTF-mode as to how complex it appears to be, based on your descriptions!

For sure did not expect the necessity to have one thing beyond the CSS selector ".sectionpar"

Guess nothing is ever grueling easy. :wink:

Hopefully not all will be, that was a particularly complex page, many will be a lot simpler.

Did you ever find a solution? I'm in the same situation. I need to monitor about 300 RSS feeds. I currently have them set up with feed parser, which is a pain: whenever I need to add a feed, I have to re-deploy...
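
One way around the re-deploy, as a rough sketch rather than a tested flow: keep the feed URLs in a plain text file, read it on a timer (inject node into a file-in node), and let a function node turn the list into one message per feed. Each message sets `msg.url`, which the core http request node uses when its own URL field is left blank, so adding a feed only means editing the file. The helper name `buildFeedMessages` and the one-URL-per-line file format are assumptions made up for this example, not something from feed parser or from this thread.

```javascript
// Sketch of a Node-RED function node body, wrapped in a plain function so it
// can also be run outside Node-RED.
// Assumption: listText is the content of a hypothetical feed list file,
// one URL per line, with blank lines and lines starting with "#" ignored.
function buildFeedMessages(listText) {
    const seen = new Set();
    const messages = [];
    for (const rawLine of String(listText).split(/\r?\n/)) {
        const line = rawLine.trim();
        // Skip blank lines and comments.
        if (line === "" || line.startsWith("#")) continue;
        // Skip anything that does not look like an http(s) URL.
        if (!/^https?:\/\//i.test(line)) continue;
        // Skip duplicates so each feed is fetched once per run.
        if (seen.has(line)) continue;
        seen.add(line);
        // msg.url is picked up by the http request node when its URL field
        // is empty; msg.topic just carries the feed URL along for later nodes.
        messages.push({ url: line, topic: line });
    }
    return messages;
}

// Inside the function node itself, the last line would be:
//   return [buildFeedMessages(msg.payload)];
// Returning an array inside an array sends the messages one after another
// to the first output.
```

From there each response can go through whatever parsing you already use. With 300 feeds it is probably worth putting a delay node in rate-limit mode between the function node and the http request node, so the requests do not all fire at once.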