Algorithm Question - How do?

working on an interesting problem. I was hoping one of you would be kind enough to point me even to the right area of math in order to solve this

here it is:

I have a very large music collection. Each piece of music has a tag, and a large portion of the songs have been already viewed, and assigned a quality of "good" or "very good" if I like them. If they've been viewed, and have no quality assigned, it means I didn't like them.

Each piece of music also has a set of tags associated with it.

I'm trying to calculate groups of tags which are highly predictive of music quality because I'm tired of shoveling through garbage.

So maybe tag A and tag B independently aren't good indicators of whether videos in their sets are good, but if both of them are present on any given video, it's 1000% more likely to be "very good" than average. It isn't necessarily just pairs though; maybe it's groups of 3/4/5/etc.

How do I solve this efficiently? What area of math even is this? Is there a name for problems like this? The emphasis is on efficiency here because the sets I'm dealing with are large.

has one or more tags*
sorry for mistakes and redundancy, this could've been written in a better way

regression analysis, factor analysis, multivariable analysis

Make a neural network that takes in the tags as inputs (1 for it has that specific tag and 0 if it doesn't), and then make it have one output (1 for very good, 0.5 for good, and 0 for bad). Next, make it learn the set of music pretty well, and then give it a test input of some tags and it will give you a rating on how much you will like it according to the past data it learned.

You can just give it every possible set of tags and then sort them by the output so you can see which set of tags you like the best.
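
Rough sketch of the encoding described above, in case it helps. The song list and tag names are made up, and instead of a trained network it just averages the scores of the songs that contain each tag combination, so treat it as a counting stand-in and not the real thing. It only looks at combinations that actually occur on some song, which avoids walking through every possible set of tags.

```python
from itertools import combinations

# Hypothetical toy library: (tags, score), 1.0 = very good, 0.5 = good, 0.0 = not liked.
SONGS = [
    ({"a", "b"}, 1.0),
    ({"a", "b", "c"}, 1.0),
    ({"a"}, 0.0),
    ({"b"}, 0.0),
    ({"a", "c"}, 0.0),
    ({"b", "c"}, 0.5),
    ({"c"}, 0.0),
]

def encode(tags, vocab):
    """Multi-hot vector: 1 if the song has the tag, 0 if it doesn't."""
    return [1 if t in tags else 0 for t in vocab]

def rank_combos(songs, max_size=2, min_support=2):
    """Average score of the songs containing each tag combination."""
    stats = {}  # combo -> [sum of scores, number of songs]
    for tags, score in songs:
        for k in range(1, max_size + 1):
            # Only combinations present on this song are ever counted.
            for combo in combinations(sorted(tags), k):
                entry = stats.setdefault(combo, [0.0, 0])
                entry[0] += score
                entry[1] += 1
    ranked = [
        (total / n, n, combo)
        for combo, (total, n) in stats.items()
        if n >= min_support  # ignore combos seen on too few songs
    ]
    # Best average first, then most songs, then alphabetical.
    return sorted(ranked, key=lambda r: (-r[0], -r[1], r[2]))

VOCAB = sorted({t for tags, _ in SONGS for t in tags})
print(encode({"a", "c"}, VOCAB))
for mean, count, combo in rank_combos(SONGS):
    print(combo, round(mean, 3), count)
```

In this toy data, tags a and b are mediocre on their own but the pair (a, b) comes out on top, which is the kind of pattern OP is after. Raising max_size gives groups of 3/4/5, at the cost of more combinations per song.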

lmao that's my end goal but I'm not far enough along in my education yet
when I can create a neural network to sort my gachimuchi by quality for me, i'll know that i've made it

Neural networks are not that hard to get into. I wrote one a while back in C++ that actually worked. You could just create it with some parameters, hand it a set of data, and it would train itself. Then you could put whatever data you wanted through it. It wasn't particularly fast, but it did work correctly (as far as I can tell). I can try to get it working again (I was using dirty windows when I wrote it) if you want, but you might just want to find a library that works better than what some autist wrote in his free time. I'll probably end up reprogramming it anyways, I haven't done anything like this for a while.

I don't really have the background for it, could you recommend any literature?

I know a little about regression analysis from linear algebra, and a little about multivariable analysis from multivariable calculus, but nothing that would be super useful.
However, I have to ask: would these methods really be useful here? Each unique tag can be thought of as a new variable or dimension, whose only meaningful value would be 0 or 1 (so each variable is discrete rather than continuous). So it's like a super high dimensional problem where each dimension has a value of 0 or 1, I guess.


I'll look into it, actually. Any resources that you'd recommend?

I (neural network goy) don't have any specific resources for making or using neural networks, I didn't read any books or anything when I made mine. I remember finding some random math paper online and using that for reference on how it should work. I did get the old program to work, cleaned up the code a bit, and made an example with comments explaining how to use it. You can either use my implementation or use it for reference to get an idea of how these things should work. Warning though, I wrote this a while ago when I was still learning C++, so it does do things rather unsafely (incorrect indexes immediately segfault), but it still works just fine. If you do have some suggestions on how to improve it, please tell me. I am planning on rewriting the whole thing anyways, and trying to get it to run on the GPU for that sweet acceleration.
Here's a link to my github, please don't dox me: github.com/DanielBatteryStapler/NeuralNetwork
The example is in main.cpp, and it doesn't use any libraries.

alright I have a vague idea of what I need to do now, but I'm not really understanding the theory of why multiple neurons are needed if they all receive the same inputs. I guess I understand that every edge might have a different weight, but it still doesn't really make sense to me tbh
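
One way to see why more than one neuron helps: a single neuron can only draw one straight cut through the inputs, so it can't say "good with exactly one of A or B, bad with both". Two hidden neurons that see the same inputs but use different weights can each pick out a different pattern, and the output neuron combines them. Minimal sketch below; the weights are picked by hand for illustration, not trained, and the function names are made up.

```python
def step(x):
    """Fire (1) when the weighted sum is positive, otherwise stay off (0)."""
    return 1 if x > 0 else 0

def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through the step function."""
    return step(sum(i * w for i, w in zip(inputs, weights)) + bias)

def tiny_net(a, b):
    # Both hidden neurons get the same inputs; only their biases differ here.
    either = neuron([a, b], [1, 1], -0.5)  # fires if at least one tag is present
    both = neuron([a, b], [1, 1], -1.5)    # fires only if both tags are present
    # The output likes "either" and is vetoed by "both".
    return neuron([either, both], [1, -2], -0.5)

results = {(a, b): tiny_net(a, b) for a in (0, 1) for b in (0, 1)}
for pair, out in results.items():
    print(pair, out)
```

If both hidden neurons had identical weights and biases they would compute the same thing and the second one would be wasted, which is why each edge having its own weight matters.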

So do you listen to music or watch it?

nice post dude

found the pedo

no, it's music, often with a visual component.
i wouldn't expect you to understand true art
mp4 related

p-post more

Good luck trying to classify FULL MOTION VIDEO using neural net technology. Once you start, your computer will never finish.

most of the stuff is too large, and i can't embed anything on this board (nice freedom) but there's some stuff i can post i guess

in addition to mp4 related, view the following:
youtube.com/watch?v=VKvZu1eHFsY

i have a thread on >>>/jp/, I won't post gachimuchi on this board out of respect for its SFW status

not what I'm trying to do

So what is it, do you listen to music or do you watch it?

You're either trying to solve a classification or a regression problem, depending on whether you want to sort the videos into 3 separate buckets (classification), or give it a numerical score (regression). Either way, it's pretty much the same under the hood, the only difference is how many output variables you have.

Neural networks are a popular approach to classification problems. Go play around with playground.tensorflow.org to get a bit of an idea of how neural nets work in practice. Basically, it boils down to a whole bunch of tensor multiplications (a vector is a rank 1 tensor, a matrix is a rank 2 tensor, etc). Check out tensorflow if you don't want to roll your own framework. It includes things like fancier optimisation algorithms (pic related) and GPU acceleration for CUDA-compatible cards.
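
To make the "bunch of multiplications" part concrete, here's a bare-bones forward pass in plain Python with no training. All the weights are made-up numbers and the names are hypothetical. Note how the only thing that changes between a single score (regression) and three buckets (classification) is the number of rows in the output matrix.

```python
import math

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dense(matrix, bias, vector):
    """One layer: matrix multiplication, add bias, squash each value into (0, 1)."""
    return [sigmoid(z + b) for z, b in zip(matvec(matrix, vector), bias)]

# Made-up weights: 3 tags -> 2 hidden neurons -> outputs.
W_HIDDEN = [[0.5, -1.0, 2.0], [1.5, 0.5, -0.5]]
B_HIDDEN = [0.0, -1.0]
W_SCORE = [[1.0, -1.0]]                              # 1 row: one numerical score
W_BUCKETS = [[1.0, -1.0], [0.0, 0.5], [-1.0, 1.0]]   # 3 rows: one value per bucket

tags = [1, 0, 1]  # the song has the first and third tag
hidden = dense(W_HIDDEN, B_HIDDEN, tags)
score = dense(W_SCORE, [0.0], hidden)
buckets = dense(W_BUCKETS, [0.0, 0.0, 0.0], hidden)
best_bucket = buckets.index(max(buckets))  # pick the bucket with the largest value
print(score, buckets, best_bucket)
```

A real framework does the same arithmetic on whole batches at once and also adjusts the weights during training, which this sketch leaves out entirely.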

Another option is to use decision trees/forests, which are basically automatically generated trees of if/else statements. Go check out XGBoost for a good gradient boosted decision tree library. The nice thing about decision trees is that they're a lot nicer to analyse than NNs. XGBoost will be able to tell you how much each tag impacts the final decision. With neural networks, it gets a bit harder.
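
For a feel of what a tree does when it picks an if/else, here's a sketch that scores a single split per tag using Gini impurity. This is not XGBoost's API or its actual algorithm, just the basic idea of measuring how well "has this tag / doesn't have it" separates liked from not liked. The songs and tag names are invented.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means perfectly pure)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def split_gain(songs, tag):
    """Impurity reduction from the rule `if tag in song: ... else: ...`."""
    with_tag = [liked for tags, liked in songs if tag in tags]
    without = [liked for tags, liked in songs if tag not in tags]
    n = len(songs)
    # Weighted impurity of the two branches after the split.
    after = (len(with_tag) * gini(with_tag) + len(without) * gini(without)) / n
    return gini([liked for _, liked in songs]) - after

# Hypothetical data: (tags, liked) with 1 = liked, 0 = not liked.
SONGS = [
    ({"remix", "vocal"}, 1),
    ({"remix"}, 1),
    ({"remix", "live"}, 1),
    ({"vocal"}, 0),
    ({"live"}, 0),
    ({"vocal", "live"}, 0),
]

gains = {tag: split_gain(SONGS, tag) for tag in ("remix", "vocal", "live")}
best_tag = max(gains, key=gains.get)
print(gains, best_tag)
```

A tree repeats this on each branch, so a second split under "remix" is how it would pick up combinations of tags. Summing the gains per tag over a whole forest is roughly what a feature importance score reports.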

mp4? I generally look at the best quality I can get for the video off mpsyt. Mp4 is a pretty crap video quality, sound-wise it's not bad though.
I haven't really screwed around with vid quality all that much but I've seen flvs with better video quality than a lot of mp4 shit. It's like transferring a video off someone's iphone to a non-apple platform. It looks like you took a video underwater without a lens.

now we know who you really are.

...