They are more like under 500kb. Maybe an average of 450? The math would be about...?
Well, we can come to a ballpark figure relatively easily.
Suppose the average size of a cover image is 290 kB, and suppose there are 270,000 such cover images that we want to store. These are both conservative figures.
Then it's a simple matter of 290 * 270,000 = 78,300,000 kB ≈ 76,500 MB ≈ 75 GB (dividing by 1024 at each step).
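For anyone who wants to plug in their own numbers, here is a minimal sketch of the same ballpark calculation. The two constants are the figures assumed in the post, the function name is made up for illustration, and 1024 is used as the step between units.

```python
# Back-of-the-envelope storage estimate for the cover images.
AVG_COVER_KB = 290      # assumed average cover size (figure from the post)
NUM_COVERS = 270_000    # assumed number of covers (figure from the post)


def estimate_storage(avg_kb, count):
    """Return (total_kb, total_mb, total_gb), using 1024 as the unit step."""
    total_kb = avg_kb * count
    total_mb = total_kb / 1024
    total_gb = total_mb / 1024
    return total_kb, total_mb, total_gb


total_kb, total_mb, total_gb = estimate_storage(AVG_COVER_KB, NUM_COVERS)
print(f"{total_kb:,} kB = {total_mb:,.0f} MB = {total_gb:.1f} GB")
# prints: 78,300,000 kB = 76,465 MB = 74.7 GB
```

Swapping in 450 for the average size would give roughly 116 GB, so even the pessimistic guess fits on an ordinary hard drive.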
I think a project like this would be great practice, although I doubt I could include it in a portfolio lol.
Why the hell not? It's extremely unlikely that you'll go to a job interview in the IT world and find that the manager has an issue with AV. But if you tell the manager it took you more than 2 days to do the project, he might not respect your skillz.
Wait... 270000 vids? That sounds a bit high. On an average day, how many new vids get released? 5-10? Not 30 I think... If we take 10 a day, then 270000 vids is 74 years' worth of AV productions!
Nah... DMM's cover images are about 250-290kb, but Amazon often uses a high quality image (the resolution is excellent) that sometimes does push 500kb. I generally use DMM's covers for my local database, not because of file size or resolution, but because DMM's covers are used by almost everyone, so it's super fast to scrape them. Whenever there's a vid I really like, I do go and look it up on Amazon to see if there's a high res cover for it.
WOW yes sorry I stand corrected. I had in mind the size of the thumbnails I make for posting things, not the covers. They are indeed mostly less than 200kb or so.
UPDATE: I've managed to implement the Japanese HTML pages too...
OK... I've come back to Python. So... the last time I was using Python (and the last time I was working on a JAV database) was 2014. A 4-year gap, I ought to congratulate myself.
Anyway... This time around, I followed my bad habit of ignoring my old code and started from scratch. It took a surprisingly short time to rebuild the core functionality of my old project. So much easier this time around, thanks to the fancy new modules the Python community has made.
OTOH, the competition is stiffer now. There are other, better projects already rolled out. And there are more websites out there (mostly profitable, I assume). So I feel it would be silly of me to try to compete.
So once more I'm building a new project for my personal needs, and I am only mentioning it here because of one single feature that perhaps fellow JAV fans might be interested in: advanced search.
E.g. find all creampie vids by any idol when she was under 21 years old. Yeah... that's a kinda crazy example. But the point is it's doable. (No... that particular search hasn't been implemented yet. Just throwing it out there to gauge interest level.) But something like "all vids starring Julia but not MIDD/MIDE/PPPD" is very easy.
As far as I know, no website (including of course Akiba) has yet offered advanced search on JAVs. So... any comments? Indifference? Encouragement?
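To give an idea of why the "starring X but not label Y" kind of search is easy with pandas, here is a rough sketch. It is not the poster's actual code: the table, the column names (`vid_id`, `cast`, `aired`), the IDs and the `search` helper are all made up for illustration.

```python
import pandas as pd

# Hypothetical catalogue; column names and rows are invented for this example.
vids = pd.DataFrame({
    "vid_id": ["MIDD-001", "PPPD-002", "ABC-003", "XYZ-004", "MIDE-005"],
    "cast": ["Julia", "Julia", "Julia", "Someone Else", "Julia"],
    "aired": pd.to_datetime(
        ["2012-01-01", "2013-02-01", "2014-03-01", "2014-04-01", "2015-05-01"]
    ),
})


def search(df, star, exclude_labels=()):
    """Rows starring `star` whose label (the part before '-') is not excluded."""
    label = df["vid_id"].str.split("-").str[0]
    mask = (df["cast"] == star) & ~label.isin(list(exclude_labels))
    return df[mask]


result = search(vids, "Julia", exclude_labels=["MIDD", "MIDE", "PPPD"])
print(result["vid_id"].tolist())  # prints: ['ABC-003']
```

Each extra condition is just one more boolean mask combined with `&`, `|` or `~`, which is why this style of query scales to fairly elaborate searches without any SQL.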
My bad! How did I miss that feature on R18.
BTW, that advanced search box also proved me wrong on the total number of JAVs. So @ldjb was exactly right about more than a quarter million JAVs. So like I said... 10 new vids per day for 74 years gives you 270,000 vids. FREAK! Or watch JAV 24 hours a day: it would take more than 50 years to finish watching every JAV once.
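The two figures above can be checked with a few lines. The release rate is the guess from earlier in the thread, and the 2-hour average runtime is purely an assumption for this sketch (nobody in the thread states one).

```python
TOTAL_VIDS = 270_000
RELEASES_PER_DAY = 10       # guess from earlier in the thread
AVG_RUNTIME_HOURS = 2.0     # assumption for illustration only

# Years needed to release that many vids at the assumed daily rate.
years_to_release = TOTAL_VIDS / RELEASES_PER_DAY / 365

# Years needed to watch everything once, non-stop, 24 hours a day.
years_to_watch = TOTAL_VIDS * AVG_RUNTIME_HOURS / (24 * 365)

print(f"{years_to_release:.0f} years to release, {years_to_watch:.0f} years to watch")
# prints: 74 years to release, 62 years to watch
```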
What language are you using for your personal use program??
It wasn't clear enough? I'm using Python. Compared to 4 years ago, the language (with the recent powerful community-contributed modules) is so much easier to develop in now. I do have to spend a bit of time figuring out the new features and digging around for docs, since the modules aren't part of the core language and so the docs aren't all in one location. But it's not too hard and well worth it. In case the other coders aren't using Python and are interested, the new features are:
- native support of unicode str (in fact str is no longer an 8-bit string; raw bytes are a separate type) (wish I could get back all those hours wasted on fighting unicode Japanese chars in Java/Python etc.)
- pandas gives an Excel-like table/database independent of SQL (SQL support has been available for a long time, of course)
- BeautifulSoup4 is much more powerful and reliable than the first-gen HTML parser
- the new dev platforms are also a bit better than 4 years ago
There are more, but no one's asking...
Sorry, I know 0 about Python. I'll look up a nice IDE for it so I can learn it.
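For the non-Python coders, a small sketch of two of the features listed above: BeautifulSoup4 parsing and native unicode strings. The HTML fragment, its class names and the IDs are invented for this example; a real site's markup will differ.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for a scraped Japanese page (markup is an assumption).
html = """
<ul class="vids">
  <li><span class="id">ABC-001</span><span class="title">サンプル作品 その1</span></li>
  <li><span class="id">ABC-002</span><span class="title">サンプル作品 その2</span></li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")

# Pull one dict per list item using CSS selectors and tag lookups.
rows = [
    {
        "id": li.find("span", class_="id").get_text(strip=True),
        "title": li.find("span", class_="title").get_text(strip=True),
    }
    for li in soup.select("ul.vids li")
]

# Python 3 str is Unicode: len() counts characters, not bytes.
title = rows[0]["title"]
print(len(title), len(title.encode("utf-8")))  # prints: 10 26
print("サンプル" in title)                      # prints: True
```

The list of dicts can be handed straight to `pandas.DataFrame(rows)` to get the Excel-like table mentioned in the list.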
Well then you have to ask yourself why learn a new language. I try very hard to control the resources I spend on JAV (the most precious resource is time). I worked on my first JAV project because I had a work reason to use (learn) Python. And then I put it on hold (it was quite functional, but I found no partner to work with or to take it over), because at that time I stopped needing Python in my work. And then the websites that my old project depended on changed, and that broke my program. And I banned myself from fixing it, because my time shouldn't be wasted on it. And now I'm back on Python for work, and my JAV project 2.0 is on.
If you do decide to try out Python, I would be happy to share my experience, (limited) knowledge and even code. You need to learn core (pure) Python (including the standard modules os and re) and these modules: urllib, BeautifulSoup, pandas.
For an IDE, there are several pro choices out there. But my current platform and recommendation for newbies is Enthought Canopy. It's not a pro choice because it lacks a (free) debugger, refactoring and project management. (If you know what refactoring means, go with PyCharm, Eclipse or Spyder; otherwise Canopy is plenty for you.) Canopy's huge advantage is that one single installation gives you everything you need to build a full JAV project (Python platform, IDE with interactive console, and all the modules you would need). It's not the way a pro might want it, but it's painless for a newbie. Also I love it because its footprint is pretty small; I do a lot of JAV project development on a nano PC.
Something like MusicBrainz for porn in general would rock. It would not even be that hard to make. Copy their source code, change the AcoustID (audio fingerprint) to some sort of video fingerprint. Let the community submit the fingerprints and metadata. It would take time, money and effort, but it is totally possible if someone spearheads it.
If I could make a video fingerprint, I would be making a high 6-figure income at Facebook or Netflix and banging hot chicks every weekend, instead of fapping to pirated JAV.
How's the project coming along?
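# Last five vids with 'アナル' in the title, ordered by air date.
# sort_values replaces sort_index(by=...), which newer pandas versions no longer accept.
javlib.findvid(jbus, 'アナル', key='title').sort_values(by='aired').iloc[-5:]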
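Since `javlib` is the poster's own module and isn't shown, here is a guess at what a one-liner like that boils down to in plain pandas. The `findvid` below is a hypothetical stand-in, not the real function, and the catalogue is dummy data.

```python
import pandas as pd


def findvid(df, text, key="title"):
    """Hypothetical stand-in for javlib.findvid: rows whose `key` column contains `text`."""
    return df[df[key].str.contains(text, regex=False, na=False)]


# Dummy catalogue with air dates deliberately out of order.
catalogue = pd.DataFrame({
    "title": [f"サンプル {i}" for i in range(1, 8)] + ["別の作品"],
    "aired": pd.to_datetime([
        "2018-03-01", "2018-01-01", "2018-07-01", "2018-05-01",
        "2018-02-01", "2018-06-01", "2018-04-01", "2018-08-01",
    ]),
})

# Filter by title, sort by air date, keep the five most recent matches.
latest = findvid(catalogue, "サンプル").sort_values(by="aired").iloc[-5:]
print(latest["title"].tolist())
# prints: ['サンプル 1', 'サンプル 7', 'サンプル 4', 'サンプル 6', 'サンプル 3']
```

Chaining filter, sort and slice like this is what makes the interactive console handy for ad hoc searches.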