JAV database

CodeGeek

Akiba Citizen
Nov 2, 2010
5,181
1,867
Well, I'm scraping JAV Library already; no reason I couldn't scrape those others. R18 has a pretty weak selection so that one seems like a loser but DMM might be good. I've never even heard of Sougouwiki.

The thing is that JAVLibrary has an unambiguous serial search (they kind of obscure it through the URLs but it's there) whereas I don't see that on DMM. Sougouwiki also doesn't seem to have individual pages for the movies which makes scraping harder.

Actually JAVLibrary is relatively easy to scrape because it has the serial search and it has CSS classes that identify the various information on the page.
I only named them as sources, not as scraping targets; I didn't have scraping in mind when I wrote that list. And I only know JAVLibrary, and I hardly use it. But if it helps you... ;)
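
As a rough illustration of the CSS-class approach mentioned in the quoted post, here is a minimal sketch using only Python's standard library. The class names (`movie-title`, `movie-id`, `cast`) and the sample markup are made up for the example; they are not JAVLibrary's real ones, so treat this as a pattern rather than a working scraper for that site.

```python
from html.parser import HTMLParser

# Tags that never get a closing tag; they must not touch the open-tag stack.
VOID_TAGS = {"img", "br", "hr", "meta", "link", "input"}

# Hypothetical markup: the class names below are invented for this example.
SAMPLE = """
<div class="movie-title">Sample Title</div>
<div class="movie-id">ABC-123</div>
<div class="cast"><span>Performer A</span>, <span>Performer B</span></div>
"""


class ClassTextExtractor(HTMLParser):
    """Collect the text found inside elements carrying a wanted CSS class."""

    def __init__(self, wanted_classes):
        super().__init__()
        self.wanted = set(wanted_classes)
        self.fields = {}   # class name -> collected text
        self._stack = []   # one entry per open tag: matched class or None

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        self._stack.append(next((c for c in classes if c in self.wanted), None))

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Attribute the text to the innermost enclosing wanted class, if any.
        for hit in reversed(self._stack):
            if hit:
                self.fields[hit] = self.fields.get(hit, "") + data
                break


def extract_fields(html, wanted_classes):
    parser = ClassTextExtractor(wanted_classes)
    parser.feed(html)
    parser.close()
    return {name: text.strip() for name, text in parser.fields.items()}


print(extract_fields(SAMPLE, ["movie-title", "movie-id", "cast"]))
```

A real scraper would fetch the page first and would need to cope with markup changes; the point here is only that stable class names make the extraction step short.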
 

R18.com

Well-Known Member
Jun 29, 2015
349
260
It is all coming from DMM.
Go to JAVLibrary and click on a product image to see where it is hosted. As you can see, all the images are served from pics.dmm.co.jp, so they are just scraping DMM.
R18.com has the same products that DMM offers as digital products. Products are released first on DVD/Blu-ray and, some months later, as a digital version.
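
The image-host check described above can be automated. The following is a minimal sketch, assuming you already have a page's HTML as a string; the sample markup and file paths are invented, and only the `pics.dmm.co.jp` host name comes from the post.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical page fragment; the paths are made up for illustration.
SAMPLE = """
<img src="https://pics.dmm.co.jp/example/cover-small.jpg">
<img src="//pics.dmm.co.jp/example/cover-large.jpg" />
<img src="/static/logo.png">
"""


class ImageHostCounter(HTMLParser):
    """Count the host names that <img> tags on a page point to."""

    def __init__(self):
        super().__init__()
        self.hosts = Counter()

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:
            host = urlparse(src).netloc
            if host:  # relative URLs have no host and are served locally
                self.hosts[host] += 1


def image_hosts(html):
    parser = ImageHostCounter()
    parser.feed(html)
    parser.close()
    return parser.hosts


print(image_hosts(SAMPLE))
```

Note that this only shows where the images are served from. Referencing another site's image server is hotlinking; it does not by itself prove how the textual data was obtained.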
 

pennybags

New Member
Aug 10, 2015
25
3
I've noticed the image thing but I can't see any way to search DMM by serial number. They have the serial number in the URL but they append things to the front and middle in a way I don't really understand and using the standard search mechanism doesn't work.

As for R18, not really on-topic, but I've bought a couple things from DMM that didn't seem to be available on R18? I don't really mind because I can read Japanese.
 

CodeGeek

Akiba Citizen
Nov 2, 2010
5,181
1,867
The point is that the real movie code is encoded in the DMM item number: they add a few extra digits, a prefix, or a suffix. For special versions (e.g. ones that include a signed photo or underwear) they put a "TK" in it. But the original movie code is always part of it.
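
Since the exact item-number scheme isn't spelled out here, the sketch below does not try to decode item numbers. It only checks whether a known movie code is embedded in a given item number, under one assumption: the letter label is followed by optional zero padding and then the number. The function name and the second example item number are hypothetical.

```python
import re


def embeds_movie_code(movie_code, item_number):
    """Return True if item_number appears to contain movie_code.

    Assumption (not confirmed by DMM): the label letters are followed by
    optional zero padding and then the number, e.g. "AVOP-117" inside
    "avop00117". Anything before or after that core is ignored.
    """
    label, _, number = movie_code.lower().partition("-")
    if not label.isalpha() or not number.isdigit():
        raise ValueError(f"expected LABEL-NUMBER, got {movie_code!r}")
    # int() drops the code's own leading zeros; "0*" then accepts any padding.
    # The lookahead stops "11" from matching inside "117".
    pattern = re.escape(label) + "0*" + str(int(number)) + r"(?![0-9])"
    return re.search(pattern, item_number.lower()) is not None


print(embeds_movie_code("AVOP-117", "avop00117"))       # padded form
print(embeds_movie_code("AVOP-117", "118avop00117tk"))  # invented prefix/suffix
print(embeds_movie_code("AVOP-117", "avop00118"))       # different number
```

The check is deliberately lenient about what surrounds the core, so it can produce false positives when one label ends with another label's letters.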
 

ding73ding

Akiba Citizen
Oct 25, 2009
2,337
2,092
Hi everyone... I am thinking of starting an experimental website on this very topic: a database for JAV... hmm actually no, it's a database+fan service site dedicated to JAV actresses. But by necessity, there must be a fairly complete JAV database underneath it.

So as this thread turned out to have a sustainable life (approaching 5 years), I assume there's still an unmet demand for an English-friendly JAV database. So I want to do an unscientific survey to see how I should develop this site:

Section 1: what do you need?
Tell us (me) what you need from a JAV database and how you would use it.
What offline tool(s) do you use/want/need with this database? (e.g. an XBMC/Kodi scraper. However, see below)
What would you want to do on my new site that is impossible or very difficult on existing resources such as here, DMM, Sougouwiki, and social media?
Is the language barrier a major problem for you when using these existing resources (DMM etc.)?
How interested are you in an actress' work and personal life outside of AV info (e.g. live appearances at promotion events and cons, her favorite food, etc.)?

Section 2: what would you contribute?
Are you willing to contribute your time and effort to a JAV actress site?
Can you write code for the web? (Which languages/frameworks are you fluent in?)
Are you good at image handling? Web layout?
Are you a good writer? Would you write (at least a few paragraphs at a time) about specific JAVs and/or actresses?
Are you such an obsessive/huge fanboy of one or more actresses that you would be her "curator"? That means spending time (every day or at least every week) following her activities and writing (or copying or translating) suitable material for her page on our site.
Do you follow one or more actresses' social media feeds? Can you understand and/or translate at least some of it? Are you willing to spend some time doing this regularly?

Basically I need to judge the potential demand for the website I have in mind, and how much help/collaboration I could expect. I already have the code to data-mine DMM (or an almost equivalent source) for past and new DVD releases and basic personal data (age/birthday, measurements), and it won't be hard to code a webpage generator for each actress. It would take a little effort, but not too much, to code a smart search tool and personalization. But if I stop there, with a website that only repackages DMM info (hard data), perhaps in a more English-friendly and easier/prettier form, I doubt there's enough to attract eyeballs. I am not doing this for profit, but what's the point if I don't even get, say, 100 unique visitors a day?

So I figure it needs some added value, such as a personal (subjective) touch: invite some individuals (obsessed fanboys) to be each actress' curator, who will maintain her page and perhaps write an informal biography and an introduction/synopsis/review of her new vids as they come out. Regular members can add comments and hopefully have a lively conversation. Basically a site for folks who not only want to collect and watch the porn (and do whatever they do while watching) but are also curious about the performers.
 

pennybags

New Member
Aug 10, 2015
25
3
The point is that the real movie code is encoded in the DMM item number: they add a few extra digits, a prefix, or a suffix. For special versions (e.g. ones that include a signed photo or underwear) they put a "TK" in it. But the original movie code is always part of it.
Yes, I agree... so, for that reason, it's not really useful to me unless there's some sort of pattern to the item numbers that I can reliably determine.
 

Casshern2

Senior Member...I think
Mar 22, 2008
7,020
14,460
I see the word "scraping", but isn't JAVLibrary simply hotlinking? Scraping would mean they grab copies from DMM; they just reference the DMM server.
 

CodeGeek

Akiba Citizen
Nov 2, 2010
5,181
1,867
I can't contribute anything to your question, but I am also working on a JAV database. In my case it isn't a website but a normal program, which may later be split into a server and a client component. To be more precise: a central database, plus clients that each have their own database, send their change requests to the central one, and receive updates from it. That means everyone can work offline and run as many queries against their local database as they want. I also plan to include a REST-style API (JSON and XML) so that it can also be queried from outside, i.e. from other programs. My free time is very limited, so progress is too.
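
To make the "local database that receives updates" idea concrete, here is a minimal sketch of the client side only. Everything in it is an assumption made for illustration: the table layout, the revision numbers, and the function names are invented, and it uses Python with SQLite rather than the Java stack the project itself is planned in. Sending change requests to the central database is not shown.

```python
import sqlite3


def open_local_db():
    """Create an in-memory local replica with a sync bookmark."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE movies (code TEXT PRIMARY KEY, title TEXT)")
    db.execute("CREATE TABLE sync_state (last_revision INTEGER NOT NULL)")
    db.execute("INSERT INTO sync_state VALUES (0)")
    db.commit()
    return db


def apply_updates(db, updates):
    """Apply (revision, code, title) updates received from the central database.

    Updates at or below the last applied revision are skipped, so receiving
    the same batch twice is harmless. Returns the number of updates applied.
    """
    (last,) = db.execute("SELECT last_revision FROM sync_state").fetchone()
    applied = 0
    with db:  # one transaction: either the whole batch lands or none of it
        for revision, code, title in sorted(updates):
            if revision <= last:
                continue
            db.execute(
                "INSERT OR REPLACE INTO movies (code, title) VALUES (?, ?)",
                (code, title),
            )
            last = revision
            applied += 1
        db.execute("UPDATE sync_state SET last_revision = ?", (last,))
    return applied


db = open_local_db()
apply_updates(db, [(1, "ABC-001", "First title"), (2, "ABC-002", "Second title")])
# All later queries hit the local copy only, so they work offline.
print(db.execute("SELECT code, title FROM movies ORDER BY code").fetchall())
```

Keeping a single revision bookmark lets a client that has been offline for a while simply ask for everything newer than its last revision.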
 

pennybags

New Member
Aug 10, 2015
25
3
Isn't a centralized database gonna be kind of huge to store on each client? Particularly if you want to store images?

I see the word "scraping", but isn't JAVLibrary simply hotlinking? Scraping would mean they grab copies from DMM; they just reference the DMM server.

I don't see anywhere you can enter new movies, so I guess it's possible they're just scraping, although they don't have a complete mirror of everything DMM lists.
 

R18.com

Well-Known Member
Jun 29, 2015
349
260
I see the word "scraping", but isn't JAVLibrary simply hotlinking? Scraping would mean they grab copies from DMM; they just reference the DMM server.

They are "scraping". They get all the data from DMM. Even the movie genres are exactly the same as on DMM: every movie has exactly the same categories as on DMM.
 

R18.com

Well-Known Member
Jun 29, 2015
349
260
We are also interested in improving our service and in knowing what users would like to have at R18.com.
R18.com is just the English version of DMM.co.jp.

Things I can see that users would like to have at R18.com and that we don't have now:

1- Simple movie ID (easy to search)
2- Actresses' personal data: sizes, birthday, etc. (we have this data, as we are DMM).

What else is missing?
 

Casshern2

Senior Member...I think
Mar 22, 2008
7,020
14,460
They are "scraping". They get all the data from DMM. Even the movie genres are exactly the same as on DMM: every movie has exactly the same categories as on DMM.

I'm sorry, you're right, for the data they are scraping.
 

CodeGeek

Akiba Citizen
Nov 2, 2010
5,181
1,867
Isn't a centralized database gonna be kind of huge to store on each client? Particularly if you want to store images?
[...]
Yes, that's correct: The images take the most space (even if we only speak about the covers). The database won't be small, but nowadays that shouldn't be a problem.
Why do I go for a local database? Two simple reasons:
  1. If the central database is offline you can still use your local one. A few times I couldn't access DMM (when they tried to block their page for foreigners and I guess a few times they had maintenance shutdowns) as well as Sougouwiki (not sure why I couldn't reach them, maybe also maintenance shutdowns). That's very annoying. :(
  2. Maybe you know AniDB. They have a central database, you can use local clients to access it, and they even have a public API. But the problem is always the queries: if a client sends too many in a certain amount of time, they will ban you. Of course they have to manage their resources too, and if a few clients bombard their server with requests, all the other users will be annoyed because they can't access the service anymore or it becomes very, very slow. On the user/client side, this flood protection can also be annoying if you have a good reason for making a lot of queries/requests. :(
    So if your database is local you can do as many requests / queries as you like and it's your problem if your resources are eaten by that. ;)
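
For the cases where a client does have to talk to a shared server, the usual way to stay clear of flood protection is to throttle on the client side. Below is a minimal sketch of a fixed minimum-interval throttle; the class name is made up, and the two-second interval is an arbitrary example, not a limit published by AniDB or any other service.

```python
import time


class MinIntervalThrottle:
    """Block so that consecutive calls are at least min_interval seconds apart."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable so the class can be tested
        self._sleep = sleep
        self._next_allowed = None

    def wait(self):
        """Sleep if needed; return the number of seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval
        return delay


# Usage: call wait() before every request to the remote service.
throttle = MinIntervalThrottle(min_interval=2.0)
throttle.wait()   # first call goes through immediately
# ... send request 1 here ...
# throttle.wait() before request 2 would now pause for up to two seconds
```

Queries against a local database need none of this, which is the trade-off described in the post.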
 

pennybags

New Member
Aug 10, 2015
25
3
Certainly I'd find the movie ID search useful but, like I said, I use dmm.co.jp.
 

JonKurak

New Member
Nov 20, 2015
16
2
Yes, I agree, a movie ID search would be better. However, R18.com is a lot better now than when it first started. I am into Michiru Morisaki. I keep tabs on her Twitter and IdeaPocket's Twitter; that's where most of the best information is to be found. I am going to work on designing a mobile app to go with our website if I get some spare time.
 

Joker6969

Member
Sep 3, 2014
75
43
You can actually search by movie id on R18.com. Just use a space instead of "-", so "avop 112" instead of "avop-112". The search matches parts of the content id string, so searching for "avop 001" will give you lots of hits, matching, for example, "avop00117".
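
The hyphen-to-space trick is easy to wrap in a helper. The second function below is only a guess at why "avop 001" also hits "avop00117": it assumes every query term merely has to occur somewhere in the content ID. Both function names are made up, and R18.com's actual matching logic is not documented here.

```python
def to_search_query(movie_id):
    """Turn a hyphenated movie ID into the space-separated form, e.g. 'avop-112'."""
    return movie_id.strip().replace("-", " ")


def partial_match(query, content_id):
    """Toy model of the observed behaviour (an assumption, not R18's real logic):
    a content ID matches when every query term is a substring of it."""
    cid = content_id.lower()
    return all(term in cid for term in query.lower().split())


print(to_search_query("avop-112"))
print(partial_match("avop 001", "avop00117"))
print(partial_match("avop 112", "avop00117"))
```

Under this model, short numbers produce many unintended hits, which fits the "lots of hits" remark above.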

Something non-obvious: you can get to the advanced search, which lets you filter by categories and such, by clicking "Go" next to the search field while leaving the field empty. The advanced search is pretty nice. It has one flaw I happened to bump into: you can't edit or remove the search text and still preserve your selections. If you search for a word you can add and remove categories/studios/actresses etc. freely, but the original word can't be changed.
 

billy_z

New Member
Apr 30, 2013
3
0
I can't contribute anything to your question, but I am also working on a JAV database. In my case it isn't a website but a normal program, which may later be split into a server and a client component. To be more precise: a central database, plus clients that each have their own database, send their change requests to the central one, and receive updates from it. That means everyone can work offline and run as many queries against their local database as they want. I also plan to include a REST-style API (JSON and XML) so that it can also be queried from outside, i.e. from other programs. My free time is very limited, so progress is too.

How far along have you got?

I am also thinking of building a REST API just for JAV, based on DMM/R18 data. Recently I handled a project scraping the whole iTunes website, so I am quite familiar with all the tools.
 

CodeGeek

Akiba Citizen
Nov 2, 2010
5,181
1,867
Unfortunately I haven't had much time, so I'm stuck about halfway. But as soon as I have something to show, I will.
 

billy_z

New Member
Apr 30, 2013
3
0
Hi everyone... I am thinking of starting an experimental website on this very topic: a database for JAV... hmm actually no, it's a database+fan service site dedicated to JAV actresses. But by necessity, there must be a fairly complete JAV database underneath it.

I am interested in working on this. I am a PHP coder. Have you started building the database?

If not, I am going ahead to build one this holiday season.
 

CodeGeek

Akiba Citizen
Nov 2, 2010
5,181
1,867
I'm a Java coder, and I also have some coding experience in PHP, but that was long ago.
My client is written completely in Java, and I'm also planning to use Java for the server side (so I can reuse components). But as a first step it will be a client-only program.
I have finished setting up the structure of the database and have already begun work on the user interface.

But why don't you start a project on your own? "Competition is good for business", right? ;)