FR: Catalog: Reduce developer frustration with package sorting

Hi Team,

I am one of those developers who took over an existing package because the original maintainer no longer updates it, fixes it, or makes improvements.

My biggest frustration (and probably that of other developers) is that my package, being scoped, never appears near the top of the list, which limits its adoption. The same problem affects packages with similar names.

Below is a related discussion:

If I put myself in the end user's shoes: they don't want to scroll through the whole list before finding a suitable package.

Would it be possible to add a "score" so that the "relevance" sort option is actually relevant?

I saw that there is an API that exposes this kind of information, but it does not seem to have been updated recently.

We could take inspiration from it and build our own algorithm based on download counts and other factors.
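
For illustration, here is a minimal sketch of such an algorithm, using only the public npm downloads endpoint as input; the log-scale scoring and the `scorePackage` name are my assumptions, not existing flow library code:

```typescript
// Hypothetical sketch: rank a package by its monthly npm downloads.
// Runs on Node 18+ (global fetch).
async function scorePackage(name: string): Promise<number> {
  const res = await fetch(
    `https://api.npmjs.org/downloads/point/last-month/${name}`
  );
  if (!res.ok) return 0; // unknown or unpublished package
  const data = (await res.json()) as { downloads: number };
  // Log scale so one hugely popular package doesn't flatten the rest.
  return Math.log10(1 + data.downloads);
}
```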

What do you think?

I changed the category; this may be more appropriate.
This is mainly linked to how the catalog is generated and how it's shown in the editor.

@knolleary Is it possible to fetch the flow library score? This would avoid using NPM and simplify things :thinking:

Node-RED can only use the information provided by the catalog JSON, so if we can get some agreement on what sorting options make sense, we can look at all the steps needed to make them work in terms of updating the catalog JSON.
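
As a purely hypothetical illustration: if each entry in the catalog JSON carried a precomputed score field, the editor could sort on it with no extra network calls. The field names below are assumptions, not the current catalog schema:

```typescript
// Hypothetical catalog entry shape; `score` is the proposed addition.
interface CatalogModule {
  id: string;          // e.g. "node-red-contrib-example"
  version: string;
  description: string;
  updated_at: string;  // ISO timestamp
  score?: number;      // 0..100, precomputed by the catalog generator
}

// "Relevance" sort: highest score first, falling back to most
// recently updated when scores are missing.
function sortByRelevance(modules: CatalogModule[]): CatalogModule[] {
  return [...modules].sort(
    (a, b) =>
      (b.score ?? 0) - (a.score ?? 0) ||
      b.updated_at.localeCompare(a.updated_at)
  );
}
```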

The scorecards are a useful point-in-time measure of a node at the moment it was published. But there are other factors that need to be taken into account, such as download count, open issue count, and recent activity. There isn't a perfect formula for this. There is also the cost of tracking and maintaining these things: recalculating the scores over time.

As the catalog is updated every half hour, would it be possible to implement an algorithm there?
It would mean implementing a lot of logic, I'm aware of that, but I think it's worth it. I can start working on it, if you're up for it?

And do remember: the scheduled NPM fetches were stopped a while ago,
so I'm not sure how effective this will all be unless the scheduled pull from NPM is re-introduced.

Personally, I think some of the original mechanisms need reviewing first?

The catalog generation is super lightweight, pulling information from the flows database. With 4750 nodes in the catalog, it would not be practical to make n*4750 additional requests to other services each time the catalog updates.

Before you start implementing anything I think we need to align on what information is needed in order to generate some sort of score and then how that information is gathered on a periodic basis and stored in the flow library database.

This reminds me there appears to be an issue with the existing task that refreshes the download counts - it doesn't look like nodes are getting refreshed properly...

There's no need to check every module each time: if the version is identical and the last release is less than three months old, skip it.

Yeah, we should actually agree on what information is needed (a weighting sketch follows the list below):

  • Maintenance

    • Release frequency
    • Commit frequency
    • Few open issues
  • Quality

    • Version > 1.0.0
    • Node engine declared
    • Deprecated version?
    • Outdated dependencies?
    • Few dependencies?
    • Has a README
    • Has a license
  • Popularity

    • Download count
    • Steady download trend?
    • Star count
    • Fork count
    • Contributor count
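
To make the discussion concrete, here is a minimal sketch of how such category scores could be combined into a single value. The categories follow the list above; every weight and field name is a placeholder assumption, not an agreed formula:

```typescript
// Hypothetical weighting sketch. Each category score is assumed to be
// normalised to 0..1 by whatever analysis gathers the raw data.
interface CategoryScores {
  maintenance: number; // release/commit frequency, open issues
  quality: number;     // semver, engines, dependencies, README, license
  popularity: number;  // downloads, stars, forks, contributors
}

// Placeholder weights; agreeing on these is exactly what this thread is for.
const WEIGHTS = { maintenance: 0.35, quality: 0.35, popularity: 0.3 };

function overallScore(s: CategoryScores): number {
  const raw =
    WEIGHTS.maintenance * s.maintenance +
    WEIGHTS.quality * s.quality +
    WEIGHTS.popularity * s.popularity;
  return Math.round(raw * 100); // 0..100, ready to store in the catalog JSON
}
```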

What about the items already in the Scorecard?

Could you provide the reasoning behind that proposal, please?

For the three months: that's how Snyk.io works.

For the reasoning: the catalog updates every half hour, so you don't have to inspect every module every time.

A module needs to be inspected only if:

  • it's new (just added to the catalog)
  • it was updated (a new version of the module was published)
  • it's due for tracking (its analysedAt is more than three months old)

The analysis gathers all of this information, computes a score value, and records an analysedAt timestamp so that not every element has to be re-analysed each time the script runs.
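
A minimal sketch of that skip rule, assuming hypothetical fields (version, analysedAt) on the record the flow library database keeps per module:

```typescript
// Hypothetical sketch of the incremental analysis rule described above.
// `stored` is what the flow library database last recorded for a module.
interface AnalysisRecord {
  version: string;
  analysedAt: Date;
  score: number;
}

const THREE_MONTHS_MS = 90 * 24 * 60 * 60 * 1000;

function needsAnalysis(
  current: { version: string },
  stored: AnalysisRecord | undefined,
  now = new Date()
): boolean {
  if (!stored) return true;                             // new module
  if (stored.version !== current.version) return true;  // new version published
  // tracking: re-analyse once the last analysis is over three months old
  return now.getTime() - stored.analysedAt.getTime() > THREE_MONTHS_MS;
}
```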


Hi Team,
I took a look at the flow library repo and... there's going to be some work :sweat_smile:.
The test/exam period is approaching (yes, I'm a student), so I don't have time to develop all the logic at the moment :confused:.

But it would already be a big step if we could agree on the list and the weighting of each element. Do you have any suggestions/comments?

Additionally:

  • Will we add Popularity, Quality and Maintenance to the Scorecard (as progress bars)?
  • Should all the score-parsing logic live in node-red-dev, to keep the flow library clean?

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.