JAV Archive Database - PHP/MySQL

Casshern2

Senior Member...I think
Mar 22, 2008
7,042
14,545
113
* Real quick! I promise this will be my last thread that I discuss any of this, you know I've been all over the board talking about this and similar things. Sorry! :D

I guess I am finally willing to release this in the wild and see what happens. I’m hopeful good things will as I know there are those of you who are good at coding, more so than I am. I’ve been a programmer for many years, but, this is my first plunge into PHP as a web technology. I’ve focused more on ASP Classic, VB Script, Visual Basic (6!), Excel VBA, HTML and light scripting for anything work or hobby related. And, yes, I realize those are all antiquated or already obsolete technologies, but, it is what I know.

Many of you will make it look better. Many of you will make it work better. Probably more of you will ask how to make it work on your PC/Mac/Linux box. You are the ones I’m worried about, to be honest, because my answer will be I don’t know. I’ve been using USBWebserver to learn PHP and create this. It is working for me, and that is going to be enough for me for now. Unless someone asks for a super cool feature that I think I would like and can probably figure out how to implement, any wants or suggestions I will leave to the community to figure out. Heck, I’m actually excited to see what others do with this. Believe me, it is basic and does just what I want it to. No doubt it will be made tons better.

What you will need to run this as presented here.

  • Web Server that can run PHP. I’ve read mentioned around here XAMPP, WAMP, LAMP, and some others. I’ve been using USBWebserver.
  • Database. Must be something PHP can read. I’m using a version of MySQL that came with USBWebserver
  • Data. For my example package I have included a database table schema (not really) and one record of data in CSV format, you can import this into the database you use.
  • PHP files. The suite of files I have included are all used to run things for me. They are basic and ugly but those who know their way around PHP should be able to read and improve on them.
  • Images. The way I use images is to always have just one for normal titles and two for VR titles. The full DVD/Blu-Ray cover is all I use, they are thumbnailed using CSS. VR titles are different, they have no front covers, so, there are only smaller “covers” available then the full title image.
  • Folders. I only use two, aside from the root where the PHP files are located. One for the full title covers and one for the smaller VR thumbnails
* If I think of anything I’m missing I’ll have to include it later.

This hobby of ours can be overwhelming at times, with the sheer amount of content out there. There is nothing else like it. Keeping track, for those who care to, can be daunting. And for those who don’t know they need a system to keep track of things, they soon find they do. My only regret with this project is that it isn’t a turnkey system anyone can get and start using. To be sure, if you set it up and run exactly the way I do, then, yes, you can get up and running pretty quick. But, my guess is most will find things lacking, that’s where it will be up to the community to develop such a thing, if you’re comfortable around PHP and the server(s) required for it to run. As you may know if you’ve followed things, my whole goal has been to make something easy to use, but, the need for something cross platform took away much of my arsenal of tools, primarily the languages mentioned above, as they are Windows-based.

In December of 2019 I turned to PHP since it can be run on any platform, being a web technology cable of it. I started learning what I could; this is the end result so far. I’ve tried to put as much comments as I could in the code as to the way and reason I do things. By all means, improve on anything that can be improved on, which will be a lot of it!

EDIT - forgot the images folder, this goes in the root as well. This is not the images for your titles, these are images for the site. You can use the dimensions of the include images to make your own logos or branding. If you need help with that let me know. :D


Wherever you have PHP installed and/or running, extract this to the root.
Shared.rar (228.8 Kb)

This is the images folder, extract to root
images.rar (116.9 Kb)
 

Attachments

Last edited:
A quick example of search and some click-thru for GVG titles

splash.jpg searchGVG.jpg searchGVGresults.jpg GVG484Title.jpg SeriesClick.jpg StarClick.jpg

Same for MMR-AK titles from Spice Visual

searchMMR-AK.jpg searchMMR-AKresults.jpg MMR-AK091Title.jpg StudioClick.jpg StarClick2.jpg

Video of MMR-AK search

MMR-AKsearch.rar (12.2 Mb)
 
Last edited:
  • Like
Reactions: desioner
Looking good, since it doesn't create the tables automatically, what tables do we need to create? I imported the table.csv file and running into some issues most likely due to table names being hardcoded.

Also, looking through the code I recommend not hardcoding the url paths as I had to change them to get it working for me as I have it setup as http://localhost/javarchive/.

EDIT
Found it, tbl_jad!

Now to figure out the following error.
Code:
Notice: Trying to get property 'num_rows' of non-object in D:\xampp\htdocs\javarchive\update.php on line 66
0 results
 
  • Like
Reactions: Casshern2
That is from the update.php file; num_rows is the number of rows returned by the query, so:

$result = $mysqli->query($sql);

if ($result->num_rows > 0) {


If the number of records (rows) returned in $results from the query $qry is greater than zero...do things with each row. Your query is coming back empty. I haven't posted anything about how the updates work quite yet, sorry.

updates.jpg

For the table and URL paths, you'll have to forgive that and make it your own. Like I've said (slightly here, more so on other threads/posts) this is what I use, users will have to tailor to their needs. But, once you do you should be okay if you know PHP. Not sure what you mean by not hardcoding, can you explain? I really only have http://localhost:8080 as the constant everywhere and then the PHP pages. Do you mean for the images, maybe? Everything is in the root and the folders are there for images, you can surely name them something else, but, yes, you'd need to address that on every page they are referenced.

Lifted this from another post:

The other thing would be what users want to track. Below are fields I use, nothing more. And some aren't populated yet either because I just haven't gotten to them (tags) or I don't know where to find them (like all idol title data).
  1. Index [database table index]
  2. DVDCode [this would be like JUFD-100.mkv]
  3. Runtime
  4. Dimensions
  5. FPS
  6. Size
  7. Bytes
  8. Disc [dual layer blu-ray disc the title is stored on]
  9. Code [this would be like JUFD, ASW, MMR-AK)
  10. Star [no idol names yet]
  11. Tags [so far unpopulated]
  12. Studio
  13. Released
  14. Label
  15. BRC [Blu-ray Code, it is a flag to know how to handle thumbnails differently]
  16. Director
  17. Series
  18. Title
  19. Month [Month released, not archived]
  20. Year [Year released, not archived]
  21. VR [flag for VR titles]
  22. DiscGroup [Wow I can't remember what this was for! LOL probably something for searches?]
  23. CW [Custom Width flag, used to handle covers that don't fit the normal layouts for standard DVD or Blu-ray titles]
  24. Bitrate
Now, before some of you chime in and say "But you're missing this field or that field!" I'm actually not. If you don't see it, I don't need it. Most likely you do. :cool::D:p

Here is [a recent] disc I imported in the database. Index is empty, handled by the import. The fields I get later from R18 scrapes are, predictably, Star, Studio, Released, Label, Director, Series, Title.

csv.jpg
Disc display for this import:
BD121.jpg
 
Last edited:
Got it, maybe I'm just missing it but how do I add a title to my collection? Trying one of the options under Updates just gives me that error.

By hardcoded I mean the URLs are all localhost:8080 which works for your setup but may not work for others depending on what program they use. I haven't coded in PHP for years but maybe I'll see if I can make the URL path a variable.
 
As for the URL, running on whatever server you use won't it always be localhost? I don't really know, just asking since if you were on an actual site, say, R18, everything would be r18.com/xxx/xxx. Or maybe I'm not understanding what you mean which is very possible.

Got it, the record entries! No, the Updates links are for updating data from R18 (stars, directors, release dates, etc.) to records already in the database. I have no automated way via the site to do this. I use an Excel VBA application to get the initial title/file data in a CSV file then I import it the MySQL table using PHPMyAdmin that is part of USBWebserver . Whatever database you're using, you'll have to get the data imported however that works.

MySQL_PHPMyAdmin.jpg
 
Yeah the locahost part is fine but the hardcoded port is more of the issue. Like I said not a big deal as it just requires some URL changes which I did. :)

As for importing, I've added an entry but not scraping any data from R18. I added GVG-900 just as a test and its not downloading and updating my table. Do I need to manually add anything else other than the DVD code?

Also, what is the table structure for tbl_codes?
 
Ah, the port, yeah, sorry. Easy to handle?

Importing, right, I just did a quick video of it. By way of demonstration, I removed all scraped data from a specific Glory Quest series. The titles are all devoid of data other than from the file itself.

GVG-583.jpg

When I run the update for a Specific Code I enter gvg and later gvh, it will query the database for any entry with that Code that does not have a series value (just the field I used to query by, really):

$sql = "SELECT DVDCode FROM tbl_jad WHERE Code = '" . $_POST["parm"][1] . "' AND Series = '' ORDER BY DVDCode";

For those records I then get just the actual code and use the tbl_codes table that I'm constantly populating with Code and digital code from R18/DMM.

$sql = "SELECT digitalcode FROM tbl_codes WHERE Code = '" . $dcode . "'";

tbl_codes.jpg

I use the digital code to send to R18 to find the specific correct title to then navigate to and scrape. Believe me, I am going probably the longest and most difficult way to scrape this data, but, it was what I know how to do.


switch ($_POST["parm"][0]) {
// single title needs DVDcode and DigitalCode
case "title":
$html = file_get_contents('https://www.r18.com/common/search/searchword=' . $_POST["parm"][2] . '/');
break;
// entire disc
case "disc":
$html = file_get_contents('https://www.r18.com/common/search/searchword=' . $code . '/');
break;
// all of the same code
case "code":
$html = file_get_contents('https://www.r18.com/common/search/searchword=' . $digitalcode . '/');
}

The code is the first value [parm(0)] being sent to the Update.php file from the Updates.php file. It then uses the digital code returned to use as the searchword in the R18 search. So for this whole process in the video this should have only been the 13 from this specific series that I cleared out, it would ignore all other Glory Quest GVG or GVH titles unless I have any without a series name. Turns out there were. Made the video a bit longer than it could have been. Currently, it won't let me know if it didn't find a title on R18 it just goes to the next. I didn't implement a button to the forms, so, just hit [Enter] to start the process. I've recorded this in real time so the video takes a minute.

SeriesUpdate.mkv (36.2 Mb)
 
Last edited:
  • Like
Reactions: milro
There isn't a complete and consistent way of searching via hyperlinks yet, I'm afraid.

Searching for Ryoko gets me to Ryoko Murakami. On any of her titles there are multiple links to search by. From left to right, Code, Star, Disc, Dimensions.

codelink.jpg starlink.jpg disclink.jpg dimlink.jpg

Dimensions was something I was trying early on since I wanted to see just how many of certain video frame sizes I had going on!

And on the title page there are most but not all of those so far but does include Director as well.

titlelinks.jpg

And, yes, since Director is a field value you can get from a scrape you can search for that as well, like if I searched for the legendary Maglow:

maglow.jpg
 
For anybody that noticed the first row always had more padding than the rest of the rows, I finally found out what that was! In the style.css file, comment out the rows repeat line in the .container section.

.container {
display: grid;
grid-gap: 5px;
grid-template-columns: repeat(auto-fit, minmax(190px, 1fr));
/** grid-template-rows: repeat(auto-fit, minmax(200px, 1fr)); **/
width: 98%;
}
 
I found that a good number of titles that I scraped had incorrect information because of how I built it. I deleted the entire database table that houses titles and started from scratch. I just completed up to disc 6 (BD006). This was necessary so that I can add DVD Codes and their corresponding digital codes from R18. Although, I do find some will still be hit and miss, but, far less than before, apparently. This situation also allowed me to make changes to the Excel VBA application that calls FFMPEG to get data file data from each title. Before I would put inside the folder I was about to (or had) burned to blu-ray to create the CSV file of data for importing. Now, since I'm starting over and scan each disc again, I added the option at runtime to scan my disc tray or the folder it is in.

It is a bit laborious but when I put a new disc in the tray I have to eyeball which I have already included in the tbl_codes table. If I'm missing it, I add it.