


FFXIDB Development Discussion


36 replies to this topic

#1 ___

___

    Advanced Member

  • Members
  • 40 posts

    Posted 06 January 2015 - 10:34 PM

    I have a tight work schedule, and most of the time I don't want to program more after work, but I am very experienced in web development.

     

    Looking at FFXIDB though, I'm not sure how much help I would be, probably mostly the unix server and sql.  I know I can't commit much time but I can at least help answer questions or debug the server if it's having problems.

     

    The thing is, I have only used Python once, though I have several years of PHP under my belt, as well as JavaScript, jQuery and HTML/CSS.  I wish I had more time to give; unfortunately I work 7 days a week already :/  That might change, so I'll just post this anyway.

     

    Edit:

     

    I see a lot of potential with ffxidb.  If managed right it could be the go-to resource over wikis.  Wikis are great in that getting content in is easy, but they have tons of pitfalls (namely that any change has to be done by hand as well).

     

    That leaves two problems for ffxidb:

    getting data in and showing it in meaningful ways.

     

    To get data in, we could crowdsource by providing a form page for input.  Another option is entering everything by hand in bulk.  Both are prone to errors.

    Option B, perhaps as a supplement, would be bots: either web bots scraping existing sites (though that may anger the people who run those sites, so probably not without permission), or tools that go over the DAT files and load them in, creating a unique field for every type of buff and stat.  That would finally allow a site where you can search, sort, and filter by job or any other stat, for example to easily find the best gear for a given slot and action.  Obviously the effects alone would produce tons of unique fields.  You could structure it with 3 tables:

     

    One for items, containing the item name, item number, and anything else unique to all items.

    One for attributes, which would include a field for type and another for amount.  (STR or Blood Pact Ability Delay or any other effect, from enchantments to costume to any other text put on the item).

    One for combining the item id, with the attribute id and if applicable give the amount of the attribute. Example:

    Generic Dagger with an item id of say 10 has +1 agility

    this would generate the following:

     

    Item table:

     

    item_id: 10

    name: Generic Dagger

    dat_id: 10101

     

    Attributes table:

    attribute_id: 1

    name: AGI

    notes: Increases AGI (could give mechanics here or even just have another table for letting attributes be commented on.  in that case we might go with a more drupal like system where everything is a node and give node ids to everything allowing them to be commented on).

     

    Item Attributes Table:

    item_id: 10

    attribute_id: 1

    value: 1
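The three tables above can be sketched with SQLite from Python. This is only an illustration of the layout described in the post, not FFXIDB's actual schema; the column types, the constraints, and the in-memory database are assumptions.

```python
import sqlite3

# Hypothetical schema following the three tables described in the post.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (
    item_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    dat_id  INTEGER
);
CREATE TABLE attributes (
    attribute_id INTEGER PRIMARY KEY,
    name         TEXT NOT NULL UNIQUE,
    notes        TEXT
);
CREATE TABLE item_attributes (
    item_id      INTEGER NOT NULL REFERENCES items(item_id),
    attribute_id INTEGER NOT NULL REFERENCES attributes(attribute_id),
    value        INTEGER,  -- NULL for effects that carry no amount
    PRIMARY KEY (item_id, attribute_id)
);
""")

# The Generic Dagger example: item 10 has AGI +1.
conn.execute("INSERT INTO items VALUES (10, 'Generic Dagger', 10101)")
conn.execute("INSERT INTO attributes VALUES (1, 'AGI', 'Increases AGI')")
conn.execute("INSERT INTO item_attributes VALUES (10, 1, 1)")

# Every item that has the AGI attribute, highest value first.
rows = conn.execute("""
    SELECT i.name, a.name, ia.value
    FROM item_attributes AS ia
    JOIN items AS i ON i.item_id = ia.item_id
    JOIN attributes AS a ON a.attribute_id = ia.attribute_id
    WHERE a.name = 'AGI'
    ORDER BY ia.value DESC
""").fetchall()
print(rows)
```

Keeping the amount in the join table rather than in the attributes table is what lets one attribute row (AGI) be shared by every item that has it.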

     

     

    Anyway, I have set up lots of custom DBs before, and with a setup like that you could write a very powerful search engine on top of it, make every single attribute on the view screen clickable, and have it act as a filter in a power search.  Since there would be thousands of attributes, you'd probably want an autocomplete function instead of a dropdown or multi-select.

     

    Each one could have a min or max value and you could add as many rows of it as you like, using javascript to add a new line of forms...
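A rough sketch of the server side of such a power search, assuming the three-table layout from earlier in this post. `build_search`, the filter tuple shape, and the sample rows (including "Example Knife") are made up for illustration.

```python
import sqlite3

def build_search(filters):
    """Build a parameterised query from (attribute, min, max) filter rows.

    Each filter row adds one join against item_attributes, so an item must
    satisfy every row. A bound of None means no limit on that side.
    """
    joins, wheres, params = [], [], []
    for n, (attr, lo, hi) in enumerate(filters):
        joins.append(
            f"JOIN item_attributes ia{n} ON ia{n}.item_id = i.item_id "
            f"JOIN attributes a{n} ON a{n}.attribute_id = ia{n}.attribute_id"
        )
        wheres.append(f"a{n}.name = ?")
        params.append(attr)
        if lo is not None:
            wheres.append(f"ia{n}.value >= ?")
            params.append(lo)
        if hi is not None:
            wheres.append(f"ia{n}.value <= ?")
            params.append(hi)
    sql = "SELECT i.name FROM items i " + " ".join(joins)
    if wheres:
        sql += " WHERE " + " AND ".join(wheres)
    return sql + " ORDER BY i.name", params

# Made-up sample data to exercise the query builder.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (item_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE attributes (attribute_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE item_attributes (item_id INTEGER, attribute_id INTEGER, value INTEGER);
INSERT INTO items VALUES (10, 'Generic Dagger'), (11, 'Example Knife');
INSERT INTO attributes VALUES (1, 'AGI'), (2, 'Accuracy');
INSERT INTO item_attributes VALUES (10, 1, 1), (11, 1, 5), (11, 2, 2);
""")

# Two filter rows: AGI of at least 2, and Accuracy of at most 3.
sql, params = build_search([("AGI", 2, None), ("Accuracy", None, 3)])
names = [row[0] for row in conn.execute(sql, params)]
print(names)
```

Only the row counter is interpolated into the SQL text; attribute names and bounds travel as bound parameters, so whatever the form sends never becomes part of the statement itself.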

     

    Now I'm just rambling but I always wondered why there wasn't such a strong database like that built for ffxi.  I guess the prospect of entering in what like 50,000 items by hand that way would be a chore... and no one ever made a bot to do that over the DAT files or something.  Not sure how it'd even be done... possibly OCR if there's no text equivalents.

     

    The last piece, once that was in place, would be to get community content on it.  You go to the wikis because they have things like testimonials, who it drops from, how hard it was, or even just "man i spent forever getting Maat's cap and it was worth it even tho it's obsolete now!"  Stuff like that, you know... a db site devoid of user comments and user-submitted info would be incomplete.  So you'd need the whole 9 yards.  The hard part is that, without scraping wikis or other sites, it'd take years to accumulate that.


    There still would be room however to make things like chrome/firefox plugins that would allow players to use them on bg-wiki and ffxiclopedia and ffxiah that would pull the items from ffxidb and give you links or just fills of info to accentuate their sites.  Even just a greasemonkey script if we set up a web api.

     

    The thing is... all that work I just outlined would probably take me like 500 hours.  And I don't have anywhere near that kind of time.  But I'm leaving this here in case anyone wants to see what could be done with ffxidb, which is pretty great.  Wikis are fine and all, but they aren't very powerful if you wish to make reports or have search tools.

     

    I had one more idea too.  Figuring out the mechanics of this game is very hard without data and making models based on that data.  If we could make a plugin for windower that would let people opt in to share their data it'd be golden.  The thing is... that data would have to be complete or it'd be worthless meaning it has to account for the food used, the items used, the job, the race, the subjob, the buffs, the monster, any debuffs on the monster... and as such you'd have to make sure people can only use it if all logs are displayed... or if you just go to the packet level and make sure that you get all the info and parse it out.  that too would be like a 300+ hour job.  I just think the number of people with time and skills to make stuff like this happen is limited. 


    I must end with a compliment, you guys do a great great job with all of this and I'm very pleasantly surprised by your ability and acumen.  Good job!

     

    Edit 2: sorry for the rant, it could be edited much better; my brain can think faster than my fingers can keep up and I didn't want to miss anything.



    #2 Iryoku

    Iryoku

      Advanced Member

    • Windower Staff
    • 488 posts

      Posted 07 January 2015 - 04:04 AM

      FFXIDB is almost entirely auto-generated. Items, mobs, and zones are all extracted from the game's dat files. The only thing that really requires any manual work is getting bounding box data for the maps, and even there we've made progress in automating the process. Very little manual work is required to maintain the site, and the goal is to keep it that way. All of the statistics are collected automatically through the guildwork plugin. In fact, much more data is collected than we know what to do with, and that's really where the help is needed. For instance, there's a huge backlog of crafting results that still needs to be processed, analyzed and have a UI designed around so that the data can be presented to users.



      #3 ___

      ___

        Advanced Member

      • Members
      • 40 posts

        Posted 07 January 2015 - 04:20 AM

        I see.  The concern I have with the data, say the drop rate on kills, is that it isn't accurate unless you also know the TH (Treasure Hunter) level.  And for crafting, you'd need to know the skill level of the user at the time of the craft.  Looking at the recent post, it seems you have the TH level logged, so that's great.
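To make the point about TH concrete, here is a minimal sketch of a drop-rate calculation that keeps each TH level separate instead of pooling all kills. The record shape `(th_level, dropped_items)` and the sample kills are invented for illustration; they are not the format the guildwork plugin actually logs.

```python
from collections import defaultdict

def drop_rates_by_th(kills, item):
    """Return {th_level: (drops, kills, rate)} for one item."""
    totals = defaultdict(lambda: [0, 0])  # th_level -> [drops, kills]
    for th_level, dropped_items in kills:
        totals[th_level][1] += 1
        if item in dropped_items:
            totals[th_level][0] += 1
    return {th: (d, k, d / k) for th, (d, k) in sorted(totals.items())}

# Made-up kill log: four kills without TH, two kills at TH2.
kills = [
    (0, []), (0, ["Beehive Chip"]), (0, []), (0, []),
    (2, ["Beehive Chip"]), (2, []),
]
rates = drop_rates_by_th(kills, "Beehive Chip")
print(rates)
```

Pooling these six kills would report a single rate that matches neither TH level, which is the inaccuracy described above.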

         

        One of the main things I feel would benefit the item database on the site is the ability to click an attribute, like Accuracy +2 on the Bee Spatha for example, and have that take you to a page of all items that affect accuracy.  From there you could search within results etc.  For search, you'd have the option to add attributes via an autocomplete, filter by ranges, and sort ascending or descending for as many attributes as you desire.

         

        When I say attribute I mean any item attribute: damage, accuracy, STR or other stats, delay, jobs, race.  FFXIAH has some of these available but in a very limited capacity.  A true db should be able to search by any attribute, even rare ones, and pull up a "custom report" or "custom search" on all data.  This includes items, drop rates, crafting results, etc.

         

        It sounds like the thing missing from this equation isn't the data itself but an interface on the website to run the selected queries and an interface to show the results?  (How are the item texts stored, if I may ask?  Is it all text?  If so, you'd have to build a parser and give it a list of tokens to build an attribute list automatically.)

         

        What would you guys think of something like a report-building UI where you could pick a data set (items, crafting, or drops) and give it criteria?  For items, that might be accuracy and being equippable by a thief.  For crafting, it might be item, synth rate, day, direction faced, and whether they had synth help on.  For drops, you could select a mob, select one item or all items, and select specific levels of TH or all levels.  You could also use the synth data to give a list of synths and the levels at which they are good to synth.  ... Man, there's just too much work to even think about; I see why you guys need help, haha.  I mean, you could also put in NPC sell rates, gather gardening info (not sure if guildwork supports that), and so on.

         

        Thanks for reading... not sure I helped much tho.



        #4 Iryoku

        Iryoku

          Advanced Member

        • Windower Staff
        • 488 posts

          Posted 07 January 2015 - 04:38 AM

          Crafting skill is logged for each crafting result. Pretty much any data that is even remotely relevant is logged, even things that have been proven to have no effect on crafting such as heading.

           

          Attributes are difficult because they are part of the item descriptions, which are written by hand by the localization teams. We can parse this, but it's not as easy as we'd like it to be, and there would have to be some manual supervision, since there are some inconsistencies. There are also occasionally errors.



          #5 Arcon

          Arcon

            Advanced Member

          • Windower Staff
          • 1189 posts
          • Location: Munich, Germany

          Posted 07 January 2015 - 07:17 AM

          I would like most of what you mentioned, and much of it was actually planned (in the vaguest sense possible), but like we said, we couldn't find anyone to help with it and my priorities have since shifted, although I would definitely be willing to coach someone else working on it, if someone would want to. *hint* *hint*

           

          However, one thing you mentioned is also one thing I would try to avoid as much as possible: user generated content. It's fine for comments and discussion, that we can implement. But I would much prefer it if we either stick to data mined content entirely or segregate user content into its own section, they should definitely be clearly separated. One of the major flaws of wikis that we tried to circumvent with the creation of the site was user generated error which is quite significant especially in the form of data sets (think drop rates reported by users on Wikia/GE).

           

          As such I would definitely completely avoid scraping any kind of wiki for content. We do not intend to completely replace them either (we link to all major wikis and FFXIAH on all appropriate pages for that reason), but I agree that we can definitely expand in many directions. Examples off the top of my head:

          • Crafting, as mentioned above. From data mining to analysis (craft chances, HQ chances, determine level caps automatically, etc.)

          • NPC locations with display on a map

          • Guild/vendor prices/contents

          • Mob TP move listing (possibly with effects)

          • Battlefield reward tracking

          • Integration with the FFXIDB plugin, especially for map-related info (mob locations, route planner, etc.)

          The UI for searchable items sounds great as well, I would love something like that, although as Iryoku mentioned, there are some inherent problems with that, but I still think it's worth exploring.

           

          Now we just need someone to actually do all of that. How about quitting your job and working on this instead? \o/



          #6 ___

          ___

            Advanced Member

          • Members
          • 40 posts

            Posted 08 January 2015 - 05:48 AM

            Thanks for the info and discussion.  I really appreciate it.  I'll keep it in mind, but I really am in no position to take on more work right now.  That may change at some point, but I can't say for sure.  My business is first priority, obviously.  And right now I'm stuck working 7 days a week to meet the demands of multiple clients.

             

            Edit: I do get about 3-6 hours a day off right now but I think to stay sane I need to use that to be doing something other than programming.



            #7 JoshK6656

            JoshK6656

              Newbie

            • Windower Staff
            • 6 posts

              Posted 08 January 2015 - 10:25 PM

              I like these two things. Guild prices and items wouldn't be hard. We could essentially collect every item an NPC sells whenever someone opens the menu if we added it to the GW plugin. I have the logic to log all this already if we want to snatch it.

               

              Mob TP moves are also pretty easy. We have this all mapped already; we just need a place to store it, maybe zone id, mob id, category, and id (so it could store both magic and TP moves)?
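One possible shape for that storage, sketched with SQLite from Python. The table name, the text values used for `category`, and every ID below are placeholders, not the plugin's real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE mob_actions (
    zone_id   INTEGER NOT NULL,
    mob_id    INTEGER NOT NULL,
    category  TEXT    NOT NULL,   -- e.g. 'tp_move' or 'magic'
    action_id INTEGER NOT NULL,
    PRIMARY KEY (zone_id, mob_id, category, action_id)
)
""")

def record_action(zone_id, mob_id, category, action_id):
    """Store an observed action once; repeat sightings are ignored."""
    conn.execute(
        "INSERT OR IGNORE INTO mob_actions VALUES (?, ?, ?, ?)",
        (zone_id, mob_id, category, action_id),
    )

# All IDs here are made up.
record_action(100, 17187000, "tp_move", 334)
record_action(100, 17187000, "tp_move", 334)  # duplicate report, ignored
record_action(100, 17187000, "magic", 144)
count = conn.execute("SELECT COUNT(*) FROM mob_actions").fetchone()[0]
print(count)
```

Making all four fields the primary key means the same move reported by many players collapses into one row.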

               

              This sounds fun!

               

              • Guild/vendor prices/contents

              • Mob TP move listing (possibly with effects)


              #8 ___

              ___

                Advanced Member

              • Members
              • 40 posts

                Posted 09 January 2015 - 12:14 AM

                I want to make time to at least write a token parser for the item descriptions.  I might use something like Bison to figure out what the tokens are, then import the list into Python so you guys can use it, or provide a SQL file.  The thing is, without human input there's no way to know all of the different "words", so to speak.  Most will fall into the form Blah Blah Stat <+/-><number>... actually, screw Bison, I don't know it.  I'll just do it with PHP regex; I know how to do that.  If someone can give me about 2000 item descriptions in a multi-line CSV file, or better yet a SQL file, I can do something with it.

                 

                It would just be a bit of trial and error, and some exceptions will take a human hand because they don't follow any pattern.  But you could get over 90% of them with just a few regex cases, then review the list of attributes and see which exceptions need to be coded.

                 

                edit: PM me it or if that isn't possible PM me and I can give you my email.



                #9 Arcon

                Arcon

                  Advanced Member

                • Windower Staff
                • 1189 posts
                • Location: Munich, Germany

                Posted 09 January 2015 - 06:17 AM

                You can find such a file in Windower/res/item_descriptions.lua, it contains descriptions for all items that are currently available, and some that aren't. Also, what we need is not the DB or list but the script itself, since we'll need to run it after every FFXI update. And the script should be inserting right into our database, although I can adjust that rather easily once you write the rest. Personally I think these will cover almost all items:

                 

                ((?:<token>)|(?:"<token>"))\s?([+-]\d+%?)

                Example: Ranged Accuracy+7

                 

                (?:(Additional effect( vs. <enemy type>)?):\s)?(Enhances|Augments|Increases|Decreases|Lowers)\s((?:<token>)|(?:"<token>"))(?:\s(effect|duration))?

                Example: Increases "Jig" duration

                 

                (Additional effect( vs. <enemy type>)?):\s((?:<token>)|(?:"<token>"))

                Example: Additional effect: Fire damage

                 

                ("<token>")

                Example: "Mordant Rime"

                 

                (Occasionally|Occ\.)\s((?:<token>)|(?:"<token>"))

                Example: Occasionally deals double damage

                 

                (<effect>):\s(<Any of the above>)

                Examples:

                * Reives: "Save TP"+400

                * Aftermath: Occ. deals double damage

                 

                Let me know if you think there are more than that. Personally I vote that whoever does that experiments with Lua a bit first, since we have access to the resources there and it's easy to quickly debug the algorithm if needed (easy input/output/visualisation), then it can be ported to anything. Eventually I will port it to Python to run on our server.
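For anyone who wants to try the idea before touching Lua, here is a minimal Python sketch of a few of the patterns above. It is not the project's parser: the `TOKEN` expression is an assumed stand-in for `<token>`, the modifier pattern only handles quoted names, the "Additional effect" patterns are left out, and `parse` and its dictionary keys are invented names.

```python
import re

# Assumed stand-in for <token>: words made of letters, dots and apostrophes.
TOKEN = r"[A-Za-z][A-Za-z.']*(?: [A-Za-z][A-Za-z.']*)*"
NAME = rf'(?:"(?P<quoted>{TOKEN})"|(?P<bare>{TOKEN}))'

STAT = re.compile(rf"^{NAME}\s?(?P<value>[+-]\d+%?)$")
MODIFIER = re.compile(
    r"^(?P<verb>Enhances|Augments|Increases|Decreases|Lowers)\s"
    rf'"(?P<name>{TOKEN})"(?:\s(?P<kind>effect|duration))?$'
)
OCCASIONALLY = re.compile(rf"^(?:Occasionally|Occ\.)\s(?P<what>{TOKEN})$")
PREFIXED = re.compile(rf"^(?P<prefix>{TOKEN}):\s(?P<rest>.+)$")

def parse(clause):
    """Return a dict describing one description clause, or None."""
    m = PREFIXED.match(clause)
    if m:
        # "<effect>: <any other pattern>", e.g. Reives or Aftermath clauses.
        inner = parse(m["rest"])
        if inner is not None:
            return {"prefix": m["prefix"], **inner}
    m = STAT.match(clause)
    if m:
        return {"type": "stat", "name": m["quoted"] or m["bare"], "value": m["value"]}
    m = MODIFIER.match(clause)
    if m:
        return {"type": "modifier", "verb": m["verb"], "name": m["name"], "kind": m["kind"]}
    m = OCCASIONALLY.match(clause)
    if m:
        return {"type": "occasionally", "what": m["what"]}
    return None

for example in ("Ranged Accuracy+7", 'Increases "Jig" duration',
                'Reives: "Save TP"+400', "Aftermath: Occ. deals double damage"):
    print(example, "->", parse(example))
```

Clauses that match none of the patterns come back as `None`, which gives a ready-made list of descriptions that need manual review.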



                #10 ___

                ___

                  Advanced Member

                • Members
                • 40 posts

                  Posted 10 January 2015 - 05:33 AM

                  Great thanks I'll update you when I get some work done on it :)  And thanks for the starting regex!  Looks good :)

                   

                  I was thinking of more general patterns that would keep the list of keywords to a minimum, as keywords need to be babysat much more during updates.

                   

                  I've never used Lua.  And I don't use Python either.  I can get the regex right in PHP pretty quickly; it just uses Perl syntax, which is pretty much what all regex parsers use.  Then if you feel the need to translate that to Python, you could.

                   

                  I can quickly turn the Lua file into a PHP array and strip the ja='' part (I'm not touching the Japanese with a 10-foot pole; I'd break something).

                   

                  Actually worked on it about 20 min got this far:

                  Code:

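                  <?php
                  // item_descriptions.php defines $items: item id => English description.
                  require_once('item_descriptions.php');
                  $attribute_strings = array();
                  $attribute_map = array();
                  // Testing on a single item for now; the loop over all items stays disabled.
                  #foreach($items as $id=>$desc){
                  	$matches = array();
                  	echo "$items[10628]\n";
                  	// Simple attribute matches: name, separator (+, : or -), then amount with optional %.
                  	// The hyphen goes last in the class so it is a literal, not a range.
                  	preg_match_all("_(\"*[A-Z]+[a-z]*[0-9]*\"*)([+:-])(\d+%*)_m",$items[10628],$matches);
                  	print_r($matches);
                  
                  #}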
                  
                  

                   

                  Output:

                  DEF:20 HP+1% MP+1% MND+6 CHR+6
                  Magic Accuracy+4
                  "Magic Atk. Bonus"+4
                  Array
                  (
                      [0] => Array
                          (
                              [0] => DEF:20
                              [1] => HP+1%
                              [2] => MP+1%
                              [3] => MND+6
                              [4] => CHR+6
                              [5] => Accuracy+4
                              [6] => Bonus"+4
                          )
                  
                      [1] => Array
                          (
                              [0] => DEF
                              [1] => HP
                              [2] => MP
                              [3] => MND
                              [4] => CHR
                              [5] => Accuracy
                              [6] => Bonus"
                          )
                  
                      [2] => Array
                          (
                              [0] => :
                              [1] => +
                              [2] => +
                              [3] => +
                              [4] => +
                              [5] => +
                              [6] => +
                          )
                  
                      [3] => Array
                          (
                              [0] => 20
                              [1] => 1%
                              [2] => 1%
                              [3] => 6
                              [4] => 6
                              [5] => 4
                              [6] => 4
                          )
                  
                  )
                  
                  

                   

                   

                  Need to work on spaces.  And a lot else but it's a start!
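As a possible next step for the spaces problem, here is a Python sketch of a pattern that keeps multi-word and quoted names together. It is only checked against the one description printed above; the rule that every word of an unquoted name starts with a capital letter is an assumption that will not hold for every item.

```python
import re

ATTRIBUTE = re.compile(
    r'("[^"]+"|[A-Z][A-Za-z.]*(?: [A-Z][A-Za-z.]*)*)'  # quoted name, or capitalised words
    r"\s?([+:-])"                                       # separator
    r"(\d+%?)"                                          # amount, optional percent sign
)

def parse_attributes(description):
    """Return (name, separator, amount) tuples found in a description."""
    return [
        (m.group(1).strip('"'), m.group(2), m.group(3))
        for m in ATTRIBUTE.finditer(description)
    ]

# The description of item 10628 as echoed in the output above.
sample = 'DEF:20 HP+1% MP+1% MND+6 CHR+6\nMagic Accuracy+4\n"Magic Atk. Bonus"+4'
attributes = parse_attributes(sample)
for attribute in attributes:
    print(attribute)
```

With this pattern the last two entries come out as "Magic Accuracy" and "Magic Atk. Bonus" rather than "Accuracy" and a dangling quote.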

                   

                  Converting the Lua to a PHP array was just this in vim.  I have to do it all the time for other stuff at work; just more regex!

                  %s/,ja="[^}]*}/}/g
                  %s/\] =/\] =>/g
                  %s/\[\(\d*\)\] =/\1 =>/g
                  %s/id=[^,]*,//g
                  %s/{en=//g
                  %s/},$/,/g
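The same conversion can be sketched in Python, for whoever ports this later. The line layout `[id] = {id=...,en="...",ja="..."},` is inferred from the substitutions above, and the sample text is invented, so treat both as assumptions about the resource file rather than a description of it.

```python
import re

# One resource line: [id] = {id=..., en="...", ...}; only id and en are kept.
LINE = re.compile(r'^\s*\[(\d+)\] = \{id=\d+,en="((?:[^"\\]|\\.)*)"')

def load_descriptions(lua_text):
    """Map item id -> English description, undoing \\" and \\n escapes."""
    items = {}
    for line in lua_text.splitlines():
        m = LINE.match(line)
        if m:
            text = m.group(2).replace('\\"', '"').replace("\\n", "\n")
            items[int(m.group(1))] = text
    return items

# Invented sample in the assumed layout; the ja field is a placeholder.
sample = r'''return {
    [10628] = {id=10628,en="DEF:20 HP+1%\n\"Magic Atk. Bonus\"+4",ja="(omitted)"},
}'''
items = load_descriptions(sample)
print(items)
```

Lines that do not look like an entry, such as the opening `return {`, are skipped.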
                  


                  #11 Arcon

                  Arcon

                    Advanced Member

                  • Windower Staff
                  • 1189 posts
                  • Location: Munich, Germany

                  Posted 10 January 2015 - 07:05 AM

                  Converting to Python will be no problem, I can do that.

                   

                  But there shouldn't be much for us to maintain if we use pre-defined tokens, since our resource extractor basically gives us all names that we could need (the names of job abilities, job traits, weapon skills and spells) along with associated meta-data, so we can just go by those. That way we could store more than just the strings, namely actual abilities and spells and link to those, associate them with jobs and so on.

                   

                  The <token> above can just be this:

                  \w+(?: \w+)*?

                   

                  After that we can just match the found result against the list of known abilities, spells, etc.
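A small Python sketch of that matching step. The `KNOWN` table is a hand-written stand-in for what the resource extractor would provide, and the lookup by lower-cased name is an assumption about how the match could be done.

```python
import re

# Hypothetical stand-in for the names and metadata from the resource extractor.
KNOWN = {
    "jig": ("job_ability", "Jig"),
    "save tp": ("job_trait", "Save TP"),
    "mordant rime": ("weapon_skill", "Mordant Rime"),
}

# The <token> expression from the post, wrapped in quotes.
QUOTED_TOKEN = re.compile(r'"(\w+(?: \w+)*?)"')

def link_tokens(description):
    """Return (kind, name) for each quoted token that is a known name."""
    found = []
    for m in QUOTED_TOKEN.finditer(description):
        entry = KNOWN.get(m.group(1).lower())
        if entry:
            found.append(entry)
    return found

print(link_tokens('Increases "Jig" duration'))
print(link_tokens('Reives: "Save TP"+400'))
print(link_tokens('"Magic Atk. Bonus"+4'))
```

The last call finds nothing because `\w` does not match the period in "Atk.", so abbreviated names would need either a wider token or an alias table.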

                   

                  Anyway, looking forward to what you will do with this ;D



                  #12 Iryoku

                  Iryoku

                    Advanced Member

                  • Windower Staff
                  • 488 posts

                    Posted 10 January 2015 - 02:33 PM

                    Actually... parsing the Japanese might be a better idea. They use far fewer abbreviations in the Japanese data, because most item descriptions fit in the available space, and it contains fewer errors, since it doesn't go through the extra translation step by the localization team. Mapping the Japanese text to English should be simple after it's parsed.



                    #13 Arcon

                    Arcon

                      Advanced Member

                    • Windower Staff
                    • 1189 posts
                    • Location: Munich, Germany

                    Posted 10 January 2015 - 03:21 PM

                    Sounds great on paper, but the regex parser would need to be more sophisticated (support UTF-8 at least, and since it's PHP who knows what it might do :D) and an understanding of Japanese along with knowledge of japanese item descriptions would be beneficial. I'd kinda like to do that, actually, but it does sound like more work :X



                    #14 Iryoku

                    Iryoku

                      Advanced Member

                    • Windower Staff
                    • 488 posts

                      Posted 10 January 2015 - 03:54 PM

                      Convert to UTF-16 first! :) There are no surrogate pairs to deal with in the data set, so UTF-16 is sufficient. For the most part the syntax is the same in Japanese. A space separated list of attributes in name+value format (also the enhances:stat format is pretty much the same). The basic stats (STR, VIT, etc.) are also all the same as in english. So the regex you have now will work almost unchanged for 90% of the cases. Defense is the obvious major exception, which is just notated as 防123, with no delimiter.
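A hedged Python sketch of the defense exception. Python 3 strings are already Unicode, so no UTF-16 conversion is needed for this illustration. The sample line is invented to follow the format described here, and the ASCII `+`/`-` signs are an assumption about the data.

```python
import re

# Either the delimiter-less defense form, or a generic name+value pair.
ATTRIBUTE = re.compile(r"(防)(\d+)|([^\W\d_]+)([+-])(\d+%?)")

def parse_ja(description):
    """Return (name, value) pairs from a Japanese-style attribute line."""
    out = []
    for m in ATTRIBUTE.finditer(description):
        if m.group(1):
            out.append((m.group(1), m.group(2)))               # 防123
        else:
            out.append((m.group(3), m.group(4) + m.group(5)))  # STR+6
    return out

pairs = parse_ja("防20 STR+6 VIT+6 HP+1%")  # made-up sample line
print(pairs)
```

`[^\W\d_]` matches a letter in any script, so the second branch would also accept kanji or kana names in front of the sign.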



                      #15 Byrth

                      Byrth

                        Advanced Member

                      • Members
                      • 85 posts

                        Posted 10 January 2015 - 05:48 PM

                        Should also track monster level (reported via /check).



                        #16 ___

                        ___

                          Advanced Member

                        • Members
                        • 40 posts

                          Posted 11 January 2015 - 04:42 AM

                          Would be nice if we could get exact monster levels :)

                           

                           

                          As for Japanese, I don't know it and I don't have a UTF-compatible editor.  I tried to get vim to like UTF before and failed after hours of slaving at it.  So I don't think that'd work.  Programming in JP would be impossible for me.

                           

                          As for a list of traits, it would be incomplete, but it might be a nice list to cross-reference to see whether a lot of attributes fail to line up... things like Blood Pact Ability Delay II -3 would not parse well with just that list, as that's only ever spelled out in full on an item.

                           

                          As for PHP, it's fully compatible with just about any encoding.  You just have to have the right library installed, which is usually pretty easy; it's a common one.

                           

                          I just got slammed with a surprise "you have to do 2 websites in 1.5 days" thing.  So I have to work my butt off; will update when able.



                          #17 Byrth

                          Byrth

                            Advanced Member

                          • Members
                          • 85 posts

                            Posted 11 January 2015 - 04:38 PM

                            The /check packet contains the exact monster level. SE just chose to have the client not display it.



                            #18 Iryoku

                            Iryoku

                              Advanced Member

                            • Windower Staff
                            • 488 posts

                              Posted 11 January 2015 - 05:31 PM

                              Our resources are already in UTF-8, so if you can load that you can do whatever you need to. Knowledge of Japanese is not really required to parse the attributes. For the most part a parser designed for English will work for Japanese, because the syntax is so similar. The major catch is that you can't do [A-Za-z0-9] type things; you need to use either . or [\p{L}\p{N}] instead. For the parts where it won't work, a quick trip to Google Translate and some checking against the English item description should tell you what you need to know. For traits, you could just have it strip off roman numerals with (IX|IV|V?I{0,3}) before trying to match the list.
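Both suggestions can be sketched in Python. The standard `re` module does not support `\p{L}`, so `[^\W_]` is used here as a substitute for "letter or digit in any script"; anchoring the numeral pattern to the end of the name, and the function name `base_name`, are additions for this illustration.

```python
import re

WORD = re.compile(r"[^\W_]+")                    # letters and digits, any script
NUMERAL = re.compile(r"\s+(?:IX|IV|V?I{0,3})$")  # trailing I to IX

def base_name(name):
    """Strip one trailing roman numeral so the name can be looked up."""
    return NUMERAL.sub("", name)

print(base_name("Blood Pact Ability Delay II"))
print(base_name("Magic Accuracy"))
print(WORD.findall("防20 STR+6"))
```

Requiring whitespace before the numeral keeps names that merely end in a capital I or V from being shortened.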

                               

                              But it's just a suggestion. I'm not really involved with the FFXIDB project, I just happen to work alongside Arcon and Aureus on Windower, and sometimes "help" with FFXIDB.



                              #19 ___

                              ___

                                Advanced Member

                              • Members
                              • 40 posts

                                Posted 12 January 2015 - 04:33 PM

                                The /check packet contains the exact monster level. SE just chose to have the client not display it.

                                Being able to display that would be good for things like Level ? Holy but not sure how SE would feel about that.

                                 

                                 

                                Our resources are already in UTF-8, so if you can load that you can do whatever you need to. Knowledge of Japanese is not really required to parse the attributes. For the most part a parser designed for English will work for Japanese, because the syntax is so similar. The major catch is that you can't do [A-Za-z0-9] type things; you need to use either . or [\p{L}\p{N}] instead. For the parts where it won't work, a quick trip to Google Translate and some checking against the English item description should tell you what you need to know. For traits, you could just have it strip off roman numerals with (IX|IV|V?I{0,3}) before trying to match the list.

                                 

                                But it's just a suggestion. I'm not really involved with the FFXIDB project, I just happen to work alongside Arcon and Aureus on Windower, and sometimes "help" with FFXIDB.

                                 

                                Hmm, well, there are a few weird characters in my PHP array for the item list on some items, likely UTF characters of some sort.  I don't know why vim is so hard to make UTF-compatible on Windows, but it is.  And I'm pretty tied to vim as a programming tool; I've used it for years and it makes my life so much easier.

                                 

                                Thanks for the suggestions.  I won't have time to work on it much this week; a client gave me a deadline for this Saturday :/  I'm going to take a 2-week vacation next month though, so I'd have some time then if not before.



                                #20 Iryoku

                                Iryoku

                                  Advanced Member

                                • Windower Staff
                                • 488 posts

                                  Posted 12 January 2015 - 04:52 PM

                                  Those are probably the elemental icons, we map them to U+E000-U+E007.

                                   

                                  It doesn't really matter what SE thinks. You can accurately determine the level of any mob that gives EXP, and for mobs that don't (are there any more of those now?...) there are ways to figure it out. The /check packet is just the easiest way to figure it out, because SE sends the mob's level along with the EP/DC/T/VT message.





