Changes (build: Success)

Summary

  1. folder structure, minimal worksheet and object, nt file in resources (commit: e4c7714)
  2. copied rdf reader to see if the more complex tech stack works, and it does after changing project to jdk 8 (commit: 6c2cccc)
  3. Outsource read-in of rdf file into own function from initial run method. make function verbose (commit: 56993cb)
  4. naive first methods for semantic similarity estimation, not runnable: need to be corrected (commit: b903364)
  5. some code cleanup and getting at least random evaluation running to see if workflow works (commit: 7fbfe30)
  6. start triple transformation such that we will get data format which fits spark implemented min hash for jaccard sim (commit: 38136e0)
  7. start collecting features, tests with some print lines to ensure code methods do the right stuff (commit: 20d632c)
  8. collect all features to later transfer them to indexes. this works and is tested by test calls and print lines. this correlates to step 4 (commit: 4707589)
  9. created feature map for features to int as step five in minhash pipeline (commit: 542d084)
  10. val for total number of features as step six, needed for feature representation (commit: 1585843)
  11. baseline tryout scala script for minhash feature representation transformation process. needed to see how scala functions and data types can be used. works on strings instead of nodes with uri. many main ideas are working (commit: 4f4a39c)
  12. transformation of rdf data to needed min hash data is done, some example code snippets from spark minHash run through. self written Vectorization used (commit: f1f8785)
  13. dense scala functions for naive transformation of triple representation to feature map (commit: c309471)
  14. new sample and started with more dense code for data transformation (commit: c231b2e)
  15. created idea for nodeIndexer class for feature transformation (commit: 1c7bc95)
  16. created node indexer as class to get from node to int and vice versa. tested in minhash tryout. structure inspired by spark vectorizers (commit: c178237)
  17. use new nodeindexer class to transform key to int (commit: 1bdec41)
  18. call and show result of node indexer (commit: 5f9a383)
  19. remove test point class (commit: 95e206b)
  20. main idea for a class to get from triples RDD[Triple] a Map[Node, Iterable[Seq[Node]]] which corresponds to a map from uri to features. currently a problem in transform because of non serializable but workaround is available for quick sessions (commit: 6d83709)
  21. call node feature factory in a working mode (commit: 3a0a63f)
  22. A complete pipeline working on text basis inspired by databricks tutorial. (commit: 602f610)
  23. more comments and more verbose, also df show without truncation (commit: 07543d1)
  24. created small sample nt file with movie idea for documentation and having very short uris (commit: 9e1b0ea)
  25. Pipeline now working more verbose and calls are more cleanly defined, stack is working (commit: 4ebae78)
  26. some new imports, new read in, experiment results as triple output with write out. export strangely as folder (commit: 1b04138)
  27. added all feature creation modes, added comments to explain some thoughts of development, changed save procedure, changed strings for final exported nt file (commit: c187350)
  28. added outputs as comments like in stackoverflow and added better reuse of sparse vectors in approxNearNeigh (commit: 0a755aa)
  29. added comments for corresponding command line outputs. especially the dataframe representations so code explains more what's happening without need to run it (commit: bea5497)
  30. implemented in pipeline a jaccard similarity calculation (commit: 90c125e)
  31. added some more comments (commit: 4798645)
  32. first idea for a module in a pipeline to generate from rdf rdd triples a df (commit: 17dafa8)
  33. some new try outs and comments but main idea is in over text pipeline (commit: 3612031)
  34. future idea for class RDF minhash (commit: 857ae92)
  35. class that could become part of pipelined semantic similarity estimation (commit: 4050c3b)
  36. comment because strange kind of test (commit: 953705e)
  37. idea to create DF (commit: 68e0236)
  38. we do not need this anymore. we won't create a central class but multiple. each for one approach (commit: 331a673)
  39. we might create a node indexer in future but currently we reuse the nlp stack inspired pipeline (commit: 906fcbb)
  40. in general this is now an old approach because we switched to nlp based pipeline (commit: dff713b)
  41. added some input descriptions and added rodriguezegenhofer, batet and tversky but some questions to these formulas stored in todos (commit: 34096bc)
  42. new modular feature extractor for similarity estimations. tested and working! (commit: c8d96af)
  43. modular pipeline for minhash. spark session start, file read-in and feature extraction added and working (commit: 0db5b8c)
  44. ideas for further approaches similar to jaccard noted (commit: 9644606)
  45. integrated count Vectorizer and made all important parameters a block and not hardcoded in constructor or method calls if they should be available later in cmd call (commit: bc731ec)
  46. added minhash with reformatting columns for easier usability and clearer reading. tested and working (commit: 373ebd2)
  47. older node based approach. deleted because will be replaced by RDF_Feature_Extractor (commit: fea7b85)
  48. older alternative feature extraction approach. deleted because will be replaced by RDF_Feature_Extractor (commit: 296dfbd)
  49. old dummy class. not used anymore (commit: a6fcbac)
  50. module for metagraph factory. the central information for an experiment gets created now; the triples for each pair are next up (commit: 625e080)
  51. replaced this tryout workflow by over text pipeline and clean version will be in run minhash (commit: 51c9b74)
  52. rearranged parameter, removed strings from central uri of experiment, and fixed bug in RDD creation. now tested and working (commit: 756615c)
  53. added line for metagraph creation. working and tested but not implemented the triples for each pair of elements with its similarity value (commit: cff6273)
  54. added creation of triples also for small comparison elements, tested and done (commit: 9c89721)
  55. store rdf representation of minhash similarity assessment. tested and working (commit: a71f4f0)
  56. placeholder which is now replaced by minhash (commit: 20c0000)
  57. added datetime to outputpath so no conflicting output paths should occur (commit: 07990dc)
  58. noted todo that relation might be changed to something different in future (commit: 4cef2ee)
  59. wrote first complete approach for a jaccard modular model corresponding to minhash from spark to make pipeline interchangeable (commit: 9baafb5)
  60. fixed bug that lit cannot handle Vector but typedLit can. also wrong df was handed over because withColumn does not work in place. now everything is tested and working for jaccard (commit: 325f3db)
  61. created complete running pipeline for jaccard with minor changes compared to Minhash. only changes needed were in some parameters, especially the minhash vs jaccard parameters and the call of jaccardmodel vs minhashmodel. tested and running. (commit: 5385cd3)
  62. added missing parameter for number hash tables and changed name to more generic name so later superclass can easier handle differences in different similarity estimators (commit: c1fb902)
  63. remove commented-out parts from minhash (commit: d1d39fa)
  64. created very generic superclass for several similarity estimations so reusability is increased. tversky uses this already. tested and working! (commit: 5d18240)
  65. tversky model implemented as current best example of reusing code for several similarity estimations. tested and working over Tversky in run (commit: ba9be76)
  66. run of tversky with new reusability. tested and working. next up making this pipeline also more reusable in code perspective and reusing generic code (commit: d5df096)
  67. ideas for making pipeline more generic (commit: 59e1c98)
  68. added keep column option to switch between behavior known from spark min hash and behavior which is needed for metagraph creation. also changed estimation udf to protected (commit: ca9c3eb)
  69. changed complete jaccard to reusing generic similarity estimator code. tested and working. added keep column option to switch between behavior known from spark min hash and behavior which is needed for metagraph creation. also changed estimation udf to protected (commit: 3947af8)
  70. changed jaccard pipeline to more generic one like tversky. tested and working (commit: 5631787)
  71. changed all fitting elements to protected, so not callable from outside. also built in keep column for metagraph vs minhash behavior. also added options for filtering and ordering results as intended (commit: c96afaa)
  72. created batet distance. everything that is needed is adjusted and now optimized compared to minhashOverpipeline because log is now inside udf. alternative still in comment (commit: ba1b6fb)
  73. created braun blanquet model. pretty similar to jaccard (commit: 1706977)
  74. removed unnecessary code which was left in comment (commit: 60b8478)
  75. created ochiai model, can be used like jaccardModel etc (commit: 6ac10a5)
  76. created Rodriguez egenhofer on basis of tversky... can be called similarly but now you have only alpha but not beta (commit: 560916f)
  77. created simpson model. similar to braun blanquet. only subsumer different (commit: 63430bd)
  78. tversky pipeline now with keep column option used for now working storage also for nn estimation to meta rdf graph (commit: 75ccc0d)
  79. added types to parameters in the beginning and changed approxNearestNeighbors s.t. it has another column so metagraph creation can handle it (commit: 97964f0)
  80. added types to parameters (commit: bb325ca)
  81. removed beta which was left from tversky (commit: 60249af)
  82. rename generic superclass so it is aligned with the _Model name domain (commit: 4b75c29)
  83. new error handling and new set opportunity. now over this so it is more aligned with the mllib behavior (commit: 336aa1d)
  84. new set opportunity. now over this so it is more aligned with the mllib behavior (commit: 14065e1)
  85. new set opportunity. now over this so it is more aligned with the mllib behavior and different error handling (commit: fe41671)
  86. new set opportunity. now over this so it is more aligned with the mllib behavior (commit: 34f27b9)
  87. different call of set parameters and also some more output for further discussions (commit: bd9d7a6)
  88. different call of set parameters (commit: d78fbf5)
  89. different call of set parameters (commit: 62c2c93)
  90. added dice model as close model to jaccard (commit: 7799390)
  91. added checks for nan bug. also inserted default alpha and beta so divide by zero cannot occur by default. (commit: 6a0e4df)
  92. print lines of all major parts so run can be inspected more easily (commit: f8f5dba)
  93. call of column name checks to look if default values fit if they are not set (commit: cf2447b)
  94. object type specification (commit: efc8d73)
  95. added column check method (commit: 5ba4483)
  96. Feature extractor now working on basis of dataframe which is read in by spark.read.rdf (commit: 3c3493c)
  97. rename to model (commit: 439d730)
  98. major pipeline calling for all similarity experiments. started to optimize for sansa server execution. creates csv with important information (commit: a2b7e0f)
  99. optimize featureExtractor for parallel computing (commit: 770d75b)
  100. optimization to run on spark server over cmd line tools (commit: 7e30bcb)
  101. mainly allow hyperparameter evaluation (commit: 1359951)
  102. clearer structure, more aligned to scala camelCase and new handling of default values (commit: f53a3a8)
  103. parameters are settable over config file. now with cmd calling, the only argument needed is the path to config (commit: 2a1d3b4)
  104. not needed anymore because the implementation in sansa which was reused here does not align with paper (commit: b6c5b8e)
  105. some new try outs (commit: 4d079d3)
  106. fixed bugs, aligned scala style and set default value handling (commit: 9a107f9)
  107. started showcasing object for minimal calls needed for example for simpleML (commit: c1ab861)
  108. first placeholder class for possible node indexer which can work as alternative to read in rdf into spark as df (commit: 6709bdb)
  109. changed to use shade dependency which creates huge jar including all needed packages. needed for server export so all imports run as intended (commit: 8257357)
  110. changed default value of features column (commit: b793bc5)
  111. scala camelCase style started (commit: 038305b)
  112. align model software code design (commit: 2c7822e)
  113. collected all currently implemented similarity estimations as minimal calls (commit: 2c041b6)
  114. added alpha and beta (commit: 15aded4)
  115. camelcase for alpha and beta (commit: 0a39073)
  116. typo in beta (commit: 0ace453)
  117. changed main class to experiment call (commit: 0f4c19a)
  118. config resolver created by farshad to handle dynamically the file path of local and hdfs paths (commit: a919c19)
  119. move outputpath to config, read-in changes by usage of config resolver so hdfs and local is usable, currently not working (commit: 6c60d80)
  120. created object to test server usage with minimal complexity (commit: 6a8c228)
  121. working evaluation pipeline (commit: 0519fb8)
  122. file lister implemented by farshad so both local and hdfs files are listable (commit: 07147f7)
  123. Fixed wrong method calls in similarity examples (commit: 5ffa145)
  124. Made FeatureExtractorModel inherit from Transformer (commit: eebee24)
  125. change to CamelCase (commit: 1070594)
  126. added number of runs as parameter for more stable processing times calculation (commit: 7e6c7f9)
  127. minor try outs (commit: 904f594)
  128. changes for camelCase (commit: f2ec52b)
  129. file was only test class (commit: 15dad37)
  130. this was first attempt but is now resolved in cleaner substructures (commit: 000a9f4)
  131. better place now for certain try outs (commit: ca4d9f5)
  132. tversky added and key retrieval aligned. (commit: 2027a67)
  133. bug fixing if both feature vectors are empty (commit: 86896bf)
  134. if feature vector union empty return 0 (commit: 6dea37e)
  135. key generation once in the beginning and length of dataframes in prints (commit: 09bab6c)
  136. added further examples in minimal calls like subset estimation or stacked approaches (commit: a1ad081)
  137. tests with filter option for only considering movies (commit: 0558c53)
  138. added stacked option and give the option to stop at specified points in pipeline to try out parts of pipeline (commit: 41f8653)
  139. some new autoimports (commit: 04d048e)
  140. remove unused comment (commit: bc089d4)
  141. autoimports reorganize and limit dataframe show to 10 rows to test if this results in out of memory (commit: d2d9ef6)
  142. removed some command line print typos and rearranged minor things (commit: aecc3a9)
  143. not needed, merged into FeatureExtractorModel. we now have overloaded transform for Dataset/Dataframe and for RDD Triple Node (commit: 831bb80)
  144. added comments aligned to scala doc and limit dataframe output in cmd print (commit: bdfa143)
  145. created docstrings aligned to scala doc and merged overloaded transform to make use of RDD Triple Node based read in possible (commit: b0b715d)
  146. created scala doc aligned docstrings (commit: dcd8248)
  147. scala docstrings to describe shortly what this class is about (commit: 04eaa21)
  148. refactor to camelCase (commit: bc3be7d)
  149. changed the name for one relation (commit: d3a6f7c)
  150. refactor name (commit: 1b10415)
  151. refactor name (commit: 77739ee)
  152. camelcase refactor (commit: ff273bd)
  153. camelcase refactor (commit: f8621ec)
  154. refactor camel case and switch to dataframe from rdd read in (commit: 7f6e75c)
  155. move local[*] to conf and not in code, quick and dirty filtering for hands on try out. started with logging instead of printing (commit: 77c8060)
  156. separated filter part for relevant uris (commit: 9de95e2)
  157. prepared first ReadMe.md (commit: 5ec68f9)
  158. remove of not needed code (commit: 21e59a5)
  159. renamed package to more suited and camelcase name (commit: b8bdeea)
  160. started with unit tests for semantic similarity estimation (commit: 2593ad9)
  161. moved spark setup inside each run so overhead is present in each run (commit: a1b3f16)
  162. optimized layout (commit: c7c434f)
  163. tests changed from show to count because it is faster (commit: 4d96d04)
  164. added information s.t. metagraph creator can easily fetch that all are similarity estimators (commit: e428ff4)
  165. changed default column labeling from underscore to camel case (commit: 5f87664)
  166. created minhash on basis of apache spark minhash lsh and added behavior as in other generic similarity estimators to allow better follow up handling of columns for e.g. metagraph creation (commit: d0cb5fd)
  167. created novel method for metagraph creation using better literal creation etc and clearer parameters (commit: 2e3a137)
  168. cleaned shape and tryouts for other datasets (commit: f5e6984)
  169. minimal calls now with novel minHash and novel metagraph creation (commit: ad65179)
  170. new default value for column labeling (commit: 6a9de1c)
  171. removed because not needed anymore. we created a new file for full pipeline with all estimators instead of having a dedicated one for each model (commit: 34d6453)
  172. new approach to have one concise pipeline for easy usage instead of needed construction of each pipeline (commit: ddf28c1)
  173. min hash deprecated and now moved to similarityPipeline (commit: 87e7b6b)
  174. changes for novel metagraph creation (commit: bf3c6d6)
  175. replaces show by count to make tests faster (commit: f5958f2)
  176. added comments and parameters, pipeline is working (commit: f328796)
  177. added another alternative to start main (commit: c9588a5)
  178. jaccard now merged in similarity Pipeline (commit: 9a025f1)
  179. keeps track of experiment number when running over extensive hyperparameter grid and printing this in cmd line (commit: 8718283)
  180. added num hash tables option (commit: 27a2e54)
  181. extended tests and give each pipeline module its own test (commit: 7dd3931)
  182. scala docs retest (commit: f084a7b)
  183. Create index.md (commit: 76c9cc0)
  184. Create index.html (commit: bc630af)
  185. Update index.md (commit: 3ff4fff)
  186. Delete index.html (commit: 99eafc4)
  187. Update index.md (commit: db60062)
  188. Update index.md (commit: e18a5a7)
  189. Update index.md (commit: 644b19e)
  190. experiments still entry point (commit: 9616163)
  191. added more models to experiments and fixed bug in counting total experiments (commit: caf057b)
  192. moved movie file in subfolder (commit: b487432)
  193. sample parameter setup for similarity evaluation (commit: 20275c2)
  194. Update .travis.yml (commit: b9c1099)
  195. Update .travis.yml (commit: 78e6c43)
  196. Update .travis.yml (commit: 5a5f6fb)
  197. Create main.yml (commit: e413943)

Commit e4c7714b4bfde7c66b130ae29778b8918401c07e by carsten.draschner
folder structure, minimal worksheet and object, nt file in resources
(commit: e4c7714)
The file was added: sansa-ml-spark/src/main/resources/rdf.nt
The file was modified: .gitignore
The file was added: sansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/Semantic_Similarity_Estimator.scala
The file was added: sansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/test_run_worksheet.sc

Commit 6c2cccc930ed8b754a7ee0c8ba05276a76a54cc0 by carsten.draschner
copied rdf reader to see if the more complex tech stack works, and it does after changing project to jdk 8
(commit: 6c2cccc)
The file was modified: sansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/Semantic_Similarity_Estimator.scala

Commit 56993cb1545ff193a6d3d32e1debe6ddd29247d9 by carsten.draschner
Outsource read-in of rdf file into own function from initial run method. make function verbose
(commit: 56993cb)
The file was modified: sansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/Semantic_Similarity_Estimator.scala

Commit b90336487da91c17d5f5e4977938f829cdd50ae7 by carsten.draschner
naive first methods for semantic similarity estimation, not runnable: need to be corrected
(commit: b903364)
The file was modified: sansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/Semantic_Similarity_Estimator.scala

Commit 7fbfe3012eae53f5c97757b43400567f4b015f87 by carsten.draschner
some code cleanup and getting at least random evaluation running to see if workflow works
(commit: 7fbfe30)
The file was modified: sansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/Semantic_Similarity_Estimator.scala

Commit 38136e08fc45ff251695a804f3c93a88fb3b1034 by carsten.draschner
start triple transformation such that we will get data format which fits spark implemented min hash for jaccard sim
(commit: 38136e0)
The file was added: sansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/minHashTryOut.scala

Commit 20d632c834c24dc03f325ea758029f3b6831dc3e by carsten.draschner
start collecting features, tests with some print lines to ensure code methods do the right stuff
(commit: 20d632c)
The file was modified: sansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/minHashTryOut.scala

Commit 4707589593f7c474fe22e416f0261a8dbd90dc3d by carsten.draschner
collect all features to later transfer them to indexes. this works and is tested by test calls and print lines. this correlates to step 4
(commit: 4707589)
The file was modified: sansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/minHashTryOut.scala

Commit 542d0843be3a2a33318620eb13b230b53dad0158 by carsten.draschner
created feature map for features to int as step five in minhash pipeline
(commit: 542d084)
The file was modified: sansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/minHashTryOut.scala

Commit 15858431eb69c7d04be9d8c2ebf0ffe68a3a2157 by carsten.draschner
val for total number of features as step six, needed for feature representation
(commit: 1585843)
The file was modified: sansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/minHashTryOut.scala

Commit 4f4a39c5282245117253dd518c7109acd96f129c by carsten.draschner
baseline tryout scala script for minhash feature representation transformation process. needed to see how scala functions and data types can be used. works on strings instead of nodes with uri. many main ideas are working
(commit: 4f4a39c)
The file was modified: sansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/test_run_worksheet.sc

Commit f1f8785b69aa892ba479a098782c2232ccdebc32 by carsten.draschner
transformation of rdf data to needed min hash data is done, some example code snippets from spark minHash run through. self written Vectorization used
(commit: f1f8785)
The file was modified: sansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/minHashTryOut.scala

Commit c3094719fde55297beed22d0814dc04276779ad7 by carsten.draschner
dense scala functions for naive transformation of triple representation to feature map
(commit: c309471)
The file was modified: sansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/test_run_worksheet.sc

Commit c231b2e36e4270a877d7408d7640ac32322560c9 by carsten.draschner
new sample and started with more dense code for data transformation
(commit: c231b2e)
The file was modified: sansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/minHashTryOut.scala

Commit 1c7bc951b41a662b6589cc2eed0ac9b0603c1f4a by carsten.draschner
created idea for nodeIndexer class for feature transformation
(commit: 1c7bc95)
The file was modified: sansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/test_run_worksheet.sc

Commit c1782371211db4f226f186bd47d3b2814494c573 by carsten.draschner
created node indexer as class to get from node to int and vice versa. tested in minhash tryout. structure inspired by spark vectorizers
(commit: c178237)
The file was added: sansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/utils/NodeIndexer.scala

Commit 1bdec41555926f14192cf04e034478a258c0a8fa by carsten.draschner
use new nodeindexer class to transform key to int
(commit: 1bdec41)
The file was modified: sansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/minHashTryOut.scala

Commit 5f9a3835ae01598d875d1bd21334af89b51b5788 by carsten.draschner
call and show result of node indexer
(commit: 5f9a383)
The file was modified: sansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/minHashTryOut.scala
The file was modified: sansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/test_run_worksheet.sc

Commit 6d8370974f4f20d7a2565e75d5a7373b8d670563 by carsten.draschner
main idea for a class to get from triples RDD[Triple] a Map[Node, Iterable[Seq[Node]]] which corresponds to a map from uri to features. currently a problem in transform because of non serializable but workaround is available for quick sessions
(commit: 6d83709)
The file was added: sansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/utils/NodeFeatureFactory.scala

Commit 3a0a63f1b746426e5447b5f488077e7ae4f17e5f by carsten.draschner
call node feature factory in a working mode
(commit: 3a0a63f)
The file was modified: sansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/minHashTryOut.scala

Commit 602f610422173642f228b1fcf0c67f3e383380c6 by carsten.draschner
A complete pipeline working on text basis inspired by databricks tutorial.
(commit: 602f610)
The file was added: sansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/minHash_rdf_over_text_pipeline.scala

Commit 07543d1828fd6ab02e17e822089c503986b4765d by carsten.draschner
more comments and more verbose, also df show without truncation
(commit: 07543d1)
The file was modified: sansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/minHash_rdf_over_text_pipeline.scala

Commit 9e1b0ea122421f3a3eaee7b0e2a26c316c6a60ab by carsten.draschner
created small sample nt file with movie idea for documentation and having very short uris
(commit: 9e1b0ea)
The file was added: sansa-ml-spark/src/main/resources/movie.nt

Commit 4ebae781c2ebf421d53b720439529cd42c219925 by carsten.draschner
Pipeline now working more verbose and calls are more cleanly defined, stack is working
(commit: 4ebae78)
The file was modified: sansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/minHash_rdf_over_text_pipeline.scala

Commit 1b04138c1815e2d1254e26490298d21027f251be by carsten.draschner
some new imports, new read in, experiment results as triple output with write out. export strangely as folder
(commit: 1b04138)
The file was modified: sansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/minHash_rdf_over_text_pipeline.scala

Commit c1873508a75f3c7a42e88272f297ab868102e527 by carsten.draschner
added all feature creation modes, added comments to explain some thoughts of development, changed save procedure, changed strings for final exported nt file
(commit: c187350)
The file was modified: sansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/minHash_rdf_over_text_pipeline.scala

Commit 0a755aa8fa6d60d8e87b80efb42f195424e55de5 by carsten.draschner
added outputs as comments like in stackoverflow and added better reuse of sparse vectors in approxNearNeigh
(commit: 0a755aa)
The file was modified: sansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/minHash_rdf_over_text_pipeline.scala

Commit bea549743329f81dbfee30fdd3c2ab6e01b87112 by carsten.draschner
added comments for corresponding command line outputs. especially the dataframe representations so code explains more what's happening without need to run it
(commit: bea5497)
The file was modified: sansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/minHash_rdf_over_text_pipeline.scala

Commit 90c125e8b16331f8ef514e47115d2bda2e9459a9 by carsten.draschner
implemented in pipeline a jaccard similarity calculation
(commit: 90c125e)
The file was modified: sansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/minHash_rdf_over_text_pipeline.scala
The file was modified: sansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/minHash_rdf_over_text_pipeline.scala
Commit 17dafa846dc5901e30c06c44801d46bbfdc9beb7 by carsten.draschner
first idea for a module in a pipeline to generate from rdf rdd triples a df
(commit: 17dafa8)
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/utils/FeatureDataframeGenerator.scala
Commit 361203117eb351c2038ebd64148786dedae40c0b by carsten.draschner
some new trz outs and comments but main idea is in over text pipeline
(commit: 3612031)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/minHashTryOut.scala (diff)
Commit 857ae9227de13da33a439eb06c82e80e3a4cfb27 by carsten.draschner
future idea for class RDF minhash
(commit: 857ae92)
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/RDF_MinHashLSH.scala
Commit 4050c3b829914de59ef9ff37f8a70d691cd8b2dc by carsten.draschner
class that could become part of pipelined semantic similaritz estimation
(commit: 4050c3b)
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/vectorizer/rdf_triple_minHash_df_converter.scala
Commit 953705efffac0ce8a52f92577be9b386af7316dd by carsten.draschner
comment because strange kind of test
(commit: 953705e)
The file was modifiedsansa-ml-spark/src/test/scala/net/sansa_stack/ml/spark/kernel/RDFFastGraphKernelTests.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/test_run_worksheet.sc (diff)
Commit 331a673360b4d8831b1b83aef30ed4b520b0252e by carsten.draschner
we do not need this anymore. we won't create one central class but multiple classes, one for each approach
(commit: 331a673)
The file was removedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/Semantic_Similarity_Estimator.scala
Commit 906fcbb34015799d588177de9767b50cef297495 by carsten.draschner
we might create a node indexer in the future, but currently we reuse the nlp stack inspired pipeline
(commit: 906fcbb)
The file was removedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/utils/NodeIndexer.scala
Commit dff713bd20095179c6cf95b1e37ca532caef06e6 by carsten.draschner
in general this is now an old approach because we switched to nlp based pipeline
(commit: dff713b)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/minHashTryOut.scala (diff)
Commit 34096bc31da2d3a7f48e159280543927e39f11d6 by carsten.draschner
added some input descriptions and added rodriguez egenhofer, batet and tversky; some open questions about these formulas are stored in todos
(commit: 34096bc)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/minHash_rdf_over_text_pipeline.scala (diff)
Commit c8d96af46532c51787bf8362171a9a20d2f25082 by carsten.draschner
new modular feature extractor for similarity estimations. tested and working!
(commit: c8d96af)
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/utils/RDF_Feature_Extractor.scala
Commit 0db5b8c459e7522b6a45ee7d3f3ab7b2068f7641 by carsten.draschner
modular pipeline for minhash. spark session start, file read-in and feature extraction added and working
(commit: 0db5b8c)
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/MinHash.scala
Commit 9644606606efdca422b975fb9cf874be543d0888 by carsten.draschner
ideas for further approaches similar to jaccard noted
(commit: 9644606)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/minHash_rdf_over_text_pipeline.scala (diff)
Commit bc731ec790b80019af21965c888999e2e50a30fe by carsten.draschner
integrated CountVectorizer and moved all important parameters into one block instead of hardcoding them in constructor or method calls, so they can be made available later in the cmd call
(commit: bc731ec)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/MinHash.scala (diff)
Commit 373ebd24c9157ea287639993ae45a89a4de5a9d8 by carsten.draschner
added minhash with reformatting columns for easier usability and clearer reading. tested and working
(commit: 373ebd2)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/MinHash.scala (diff)
Commit fea7b851c3e043365fc7fbce00ae6012970e2ee7 by carsten.draschner
older node based approach. deleted because will be replaced by RDF_Feature_Extractor
(commit: fea7b85)
The file was removedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/utils/FeatureDataframeGenerator.scala
Commit 296dfbd1dc00d6526b5341da1d42e1ef423c20ea by carsten.draschner
older alternative feature extraction approach. deleted because will be replaced by RDF_Feature_Extractor
(commit: 296dfbd)
The file was removedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/utils/NodeFeatureFactory.scala
Commit a6fcbacc6406c5523c230a5ae9ee6c389e27cd56 by carsten.draschner
old dummy class. not used anymore
(commit: a6fcbac)
The file was removedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/vectorizer/rdf_triple_minHash_df_converter.scala
Commit 625e080e78ed8180001eadf924513a55efd68d18 by carsten.draschner
module for metagraph factory. the central information for an experiment gets created now; the triples for each pair are next up
(commit: 625e080)
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/utils/Similarity_Experiment_Meta_Graph_Factory.scala
Commit 51c9b7436104008a82267dc42bf441cd110428d3 by carsten.draschner
replaced this tryout workflow by the over text pipeline; the clean version will be in run minhash
(commit: 51c9b74)
The file was removedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/minHashTryOut.scala
Commit 756615c7be0f93639abea3b95e999513ba9ee533 by carsten.draschner
rearranged parameters, removed strings from central uri of experiment, and fixed bug in RDD creation. now tested and working
(commit: 756615c)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/utils/Similarity_Experiment_Meta_Graph_Factory.scala (diff)
Commit cff627358e1cb72fdf71a5c3400a53f9998dca5a by carsten.draschner
added line for metagraph creation. working and tested, but the triples for each pair of elements with its similarity value are not implemented yet
(commit: cff6273)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/MinHash.scala (diff)
Commit 9c897212b8f5d3d334483f8bfa72d9ab6ff766cf by carsten.draschner
added creation of triples also for small comparison elements, tested and done
(commit: 9c89721)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/utils/Similarity_Experiment_Meta_Graph_Factory.scala (diff)
Commit a71f4f02470900d7c827eaa8c96c43488addfaeb by carsten.draschner
store rdf representation of minhash similarity assessment. tested and working
(commit: a71f4f0)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/MinHash.scala (diff)
Commit 20c0000b40a1399be597d227e2abc38f48cd28d4 by carsten.draschner
placeholder which is now replaced by minhash
(commit: 20c0000)
The file was removedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/RDF_MinHashLSH.scala
Commit 07990dc3a00efa1de039285f882e83b4ca21bbb4 by carsten.draschner
added datetime to outputpath so no conflicting output paths should occur
(commit: 07990dc)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/MinHash.scala (diff)
Commit 4cef2eedf8efedd1d33bc2063896f14f74fa5fc5 by carsten.draschner
noted todo that relation might be changed to something different in future
(commit: 4cef2ee)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/utils/Similarity_Experiment_Meta_Graph_Factory.scala (diff)
Commit 9baafb54a8efb073a9c4e0029877af1c9d1ff2cc by carsten.draschner
wrote first complete approach for a jaccard modular model corresponding to minhash from spark to make pipeline interchangeable
(commit: 9baafb5)
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/JaccardModel.scala
Commit 325f3db741383ad832548524db4cddf21843d396 by carsten.draschner
fixed bug that lit cannot handle Vector but typedLit can. also the wrong df was handed over because withColumn does not work in place. now everything is tested and working for jaccard
(commit: 325f3db)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/JaccardModel.scala (diff)
Commit 5385cd31026c8d822f79f34fb1e753ce314c2360 by carsten.draschner
created complete running pipeline for jaccard with minor changes compared to Minhash. the only changes needed were in some parameters, especially the minhash vs jaccard parameters and the call of jaccardmodel vs minhashmodel. tested and running.
(commit: 5385cd3)
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/Jaccard.scala
Commit c1fb90248cd7d8edc415f567b29268a6217fd9b0 by carsten.draschner
added missing parameter for number of hash tables and changed name to a more generic one so a later superclass can more easily handle differences between similarity estimators
(commit: c1fb902)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/MinHash.scala (diff)
Commit d1d39fa605e0608c149eb88095ec7df45d4fe4fd by carsten.draschner
remove commented-out parts from minhash
(commit: d1d39fa)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/Jaccard.scala (diff)
Commit 5d182400e6d712ce7862ca66325e7b4648ff78f7 by carsten.draschner
created very generic superclass for several similarity estimations so reusability is increased. tversky uses this already. tested and working!
(commit: 5d18240)
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/GenericSimilarityEstimator.scala
Commit ba9be76d14c2d7635197073bfda54196b05b80f4 by carsten.draschner
tversky model implemented as current best example of reusing code for several similarity estimations. tested and working over Tversky in run
(commit: ba9be76)
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/TverskyModel.scala
Commit d5df0967460642ea1ef1f8038448d900446023aa by carsten.draschner
run of tversky with new reusability. tested and working. next up making this pipeline also more reusable in code perspective and reusing generic code
(commit: d5df096)
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/Tversky.scala
Commit 59e1c98acfcce3b4f4e18c663bdae9d3f8293c8a by carsten.draschner
ideas for making pipeline more generic
(commit: 59e1c98)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/Jaccard.scala (diff)
Commit ca9c3ebac3783330fb764def83c021ea38bb9f44 by carsten.draschner
added keep column option to switch between behavior known from spark min hash and behavior which is needed for metagraph creation. also changed estimation udf to protected
(commit: ca9c3eb)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/TverskyModel.scala (diff)
Commit 3947af8a7dc914e2ad3b17f933bb18e1bd4653c1 by carsten.draschner
changed complete jaccard to reusing generic similarity estimator code. tested and working. added keep column option to switch between behavior known from spark min hash and behavior which is needed for metagraph creation. also changed estimation udf to protected
(commit: 3947af8)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/JaccardModel.scala (diff)
Commit 563178767e4f1130f67a8df68c912b8cb8f94195 by carsten.draschner
changed jaccard pipeline to more generic one like tversky. tested and working
(commit: 5631787)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/Jaccard.scala (diff)
Commit c96afaa2bcb75fa524064585e06377015d093939 by carsten.draschner
changed all fitting elements to protected, so they are not callable from outside. also built in keep column for metagraph vs minhash behavior. also added options for filtering and ordering results as intended
(commit: c96afaa)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/GenericSimilarityEstimator.scala (diff)
Commit ba1b6fb0d90deb62ad5a05e1ee0435e9747e8438 by carsten.draschner
created batet distance. everything that is needed is adjusted and now optimized compared to minhashOverpipeline because log is now inside udf. alternative still in comment
(commit: ba1b6fb)
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/BatetModel.scala
Commit 1706977e4fc93fb4a05edbc7acf29087b35647f0 by carsten.draschner
created braun blanquet model. pretty similar to jaccard
(commit: 1706977)
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/BraunBlanquetModel.scala
Commit 60b84789115b14ffc1dbda5952a04853069ccc82 by carsten.draschner
removed unnecessary code which was left in comment
(commit: 60b8478)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/JaccardModel.scala (diff)
Commit 6ac10a55e9140a05ec38328720abfd957d0367dd by carsten.draschner
created ochiai model. can be used like jaccardModel etc
(commit: 6ac10a5)
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/OchiaiModel.scala
Commit 560916fec1923678fac4134fa28d357317c15237 by carsten.draschner
created Rodriguez Egenhofer on basis of tversky... can be called similarly but now you have only alpha and no beta
(commit: 560916f)
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/RodriguezEgenhoferModel.scala
Commit 63430bdd014e899a0e0628a3d33044f6e8ae652f by carsten.draschner
created simpson model. similar to braun blanquet. only subsumer different
(commit: 63430bd)
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/SimpsonModel.scala
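For orientation, the set-overlap measures named in the commit messages here (Jaccard, Braun-Blanquet, Simpson) share the same numerator and differ only in the denominator. The following is a minimal sketch on plain Scala sets, not the code from these commits: the object and method names are hypothetical, and returning 0.0 for empty inputs is an assumption made for this sketch.

```scala
// Illustrative sketch only: plain Scala sets instead of Spark feature vectors.
// Object and method names are hypothetical, not taken from the repository.
object SetOverlapSketch {
  // Jaccard: |A ∩ B| / |A ∪ B|
  def jaccard[T](a: Set[T], b: Set[T]): Double = {
    val union = (a union b).size
    if (union == 0) 0.0 else (a intersect b).size.toDouble / union
  }

  // Braun-Blanquet: |A ∩ B| / max(|A|, |B|)
  def braunBlanquet[T](a: Set[T], b: Set[T]): Double = {
    val denom = math.max(a.size, b.size)
    if (denom == 0) 0.0 else (a intersect b).size.toDouble / denom
  }

  // Simpson: |A ∩ B| / min(|A|, |B|)
  def simpson[T](a: Set[T], b: Set[T]): Double = {
    val denom = math.min(a.size, b.size)
    if (denom == 0) 0.0 else (a intersect b).size.toDouble / denom
  }
}
```

With `a = Set("p1", "p2", "p3")` and `b = Set("p2", "p3", "p4", "p5")`, the shared numerator is 2 and the denominators are 5 (union), 4 (larger set) and 3 (smaller set) respectively.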
Commit 75ccc0d8c1db33cebd2bfd033da675fae28705e2 by carsten.draschner
tversky pipeline now uses the keep column option, so storage to the meta rdf graph now also works for nn estimation
(commit: 75ccc0d)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/Tversky.scala (diff)
Commit 97964f0be8aad52671faf1a884be480086ace278 by carsten.draschner
added types to parameters in the beginning and changed approxNearestNeighbors such that it has another column so metagraph creation can handle it
(commit: 97964f0)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/MinHash.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/Jaccard.scala (diff)
Commit 60249afe9fdef83dd960a3860dd887963068ab95 by carsten.draschner
removed beta which was left from tversky
(commit: 60249af)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/RodriguezEgenhoferModel.scala (diff)
Commit 4b75c29aa942b0ec8b6d185d6c34c2aae50bf36b by carsten.draschner
rename generic superclass so it is aligned with the _Model name domain
(commit: 4b75c29)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/SimpsonModel.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/BraunBlanquetModel.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/RodriguezEgenhoferModel.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/JaccardModel.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/BatetModel.scala (diff)
The file was removedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/GenericSimilarityEstimator.scala
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/TverskyModel.scala (diff)
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/GenericSimilarityEstimatorModel.scala
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/OchiaiModel.scala (diff)
Commit 336aa1dab298225e90f731563814e5a6faadb683 by carsten.draschner
new error handling and new set option. now via this so it is more aligned with the mllib behavior
(commit: 336aa1d)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/TverskyModel.scala (diff)
Commit 14065e1efd4cdb26794ae2cc713cc6d6c0d32389 by carsten.draschner
new set option. now via this so it is more aligned with the mllib behavior
(commit: 14065e1)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/GenericSimilarityEstimatorModel.scala (diff)
Commit fe4167119cb8b541dadb5b7938983b963755adc4 by carsten.draschner
new set option. now via this so it is more aligned with the mllib behavior and different error handling
(commit: fe41671)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/RodriguezEgenhoferModel.scala (diff)
Commit 34f27b9881bdfc1c3479b6f6928fd2338e2a2b7e by carsten.draschner
new set option. now via this so it is more aligned with the mllib behavior
(commit: 34f27b9)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/utils/RDF_Feature_Extractor.scala (diff)
Commit bd9d7a6853759dcb975f3302850ab3f6a159c045 by carsten.draschner
different call of set parameters and also some more output for further discussions
(commit: bd9d7a6)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/MinHash.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/Jaccard.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/Tversky.scala (diff)
Commit 77993903b1c8ec876c3762b7ce6b5fb8b8b2c621 by carsten.draschner
added dice model as close model to jaccard
(commit: 7799390)
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/DiceModel.scala
Commit 6a0e4df8d60699f7ca4edd1c73a24cd1bf0f59dc by carsten.draschner
added checks for nan bug. also inserted default alpha and beta so divide by zero cannot occur by default.
(commit: 6a0e4df)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/TverskyModel.scala (diff)
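The NaN issue mentioned in this commit message can be illustrated on the Tversky index, |A ∩ B| / (|A ∩ B| + α|A \ B| + β|B \ A|), whose denominator becomes zero when the sets are disjoint and both weights are 0, or when both sets are empty. The sketch below works on plain Scala sets and is not the code from this commit: the names are hypothetical, and the defaults of 1.0 and the fallback value 0.0 are assumptions chosen for illustration.

```scala
// Illustrative sketch only; names, default weights and the 0.0 fallback are assumptions.
object TverskySketch {
  // Tversky index: |A ∩ B| / (|A ∩ B| + alpha * |A \ B| + beta * |B \ A|)
  def tversky[T](a: Set[T], b: Set[T], alpha: Double = 1.0, beta: Double = 1.0): Double = {
    val common = (a intersect b).size.toDouble
    val denom = common + alpha * (a diff b).size + beta * (b diff a).size
    // Guard against 0.0 / 0.0, which would otherwise yield NaN.
    if (denom == 0.0) 0.0 else common / denom
  }
}
```

With `alpha = beta = 1.0` the value equals the Jaccard index, and with `alpha = beta = 0.5` it equals the Dice coefficient.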
Commit f8f5dbad86a5af1864aeee6d8c5f9b44d1a87436 by carsten.draschner
print lines of all major parts so run can be inspected more easily
(commit: f8f5dba)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/Tversky.scala (diff)
Commit cf2447b4688712417b52b41eeeea738ef53e72fb by carsten.draschner
call of column name checks to see whether default values fit if they are not set
(commit: cf2447b)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/SimpsonModel.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/MinHash.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/GenericSimilarityEstimatorModel.scala (diff)
Commit 3c3493cab244ecccc7b91963a51d9c44ec4b39c1 by carsten.draschner
Feature extractor now working on basis of dataframe which is read in by spark.read.rdf
(commit: 3c3493c)
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/utils/FeatureExtractor.scala
The file was removedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/utils/FeatureExtractor.scala
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/utils/FeatureExtractorModel.scala
Commit a2b7e0f329e61cdb2d51b610ace243db6bb321c3 by carsten.draschner
major pipeline calling for all similarity experiments. started to optimize for sansa server execution. creates csv with important information
(commit: a2b7e0f)
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/experiment/SimilarityPipelineExperiment.scala
Commit 770d75b22782815751275f70823aa61a16e3c705 by carsten.draschner
optimize featureExtractor for parallel computing
(commit: 770d75b)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/utils/FeatureExtractorModel.scala (diff)
Commit 7e30bcb2e87518a02a2656f6bf60a102a8eef3b2 by carsten.draschner
optimization to run on spark server over cmd line tools
(commit: 7e30bcb)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/minHash_rdf_over_text_pipeline.scala (diff)
Commit 135995194bb9b2472d8552e1a80388a602d4093a by carsten.draschner
mainly allow hyperparameter evaluation
(commit: 1359951)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/experiment/SimilarityPipelineExperiment.scala (diff)
Commit f53a3a8480422773a765d3b03e8ea1eedd3d85cb by carsten.draschner
clearer structure, more aligned to scala camelCase and new handling of default values
(commit: f53a3a8)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/GenericSimilarityEstimatorModel.scala (diff)
Commit 2a1d3b4bcc53187f77093b89b6aa1f3b8418dd10 by carsten.draschner
parameters are settable via config file. cmd calls now need only one argument, which specifies the path to the config
(commit: 2a1d3b4)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/experiment/SimilarityPipelineExperiment.scala (diff)
Commit b6c5b8eacc4eb5f2def3c71b5ca2665c0842c233 by carsten.draschner
not needed anymore because the implementation in sansa which was reused here does not align with the paper
(commit: b6c5b8e)
The file was removedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/RodriguezEgenhoferModel.scala
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/test_run_worksheet.sc (diff)
Commit 9a107f9d8c073a3dfe16fe138a95d5f9417b40e5 by carsten.draschner
fixed bugs, aligned scala style and set default value handling
(commit: 9a107f9)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/JaccardModel.scala (diff)
Commit c1ab861677d34c072bb64e2b39fb85ea3692e743 by carsten.draschner
started showcasing object for minimal calls needed for example for simpleML
(commit: c1ab861)
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/examples/minimalCalls.scala
Commit 6709bdbcbcd0aa46d275e27179f3750f6c694f7c by carsten.draschner
first placeholder class for possible node indexer which can work as alternative to read in rdf into spark as df
(commit: 6709bdb)
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/utils/NodeIndexer.scala
Commit 8257357dc60b9632ea51f27b33b23306a67435da by carsten.draschner
changed to use shade dependency, which creates a huge jar including all needed packages. needed for server export so all imports run as intended
(commit: 8257357)
The file was modifiedsansa-ml-spark/pom.xml (diff)
Commit b793bc5e9288436250a777a1e4c505c60191cb09 by carsten.draschner
changed default value of features column
(commit: b793bc5)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/utils/FeatureExtractorModel.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/OchiaiModel.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/DiceModel.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/SimpsonModel.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/TverskyModel.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/BatetModel.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/BraunBlanquetModel.scala (diff)
Commit 2c7822e37a4fb6b9a29911a3f3206f7208eb85c3 by carsten.draschner
align model software code design
(commit: 2c7822e)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/DiceModel.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/TverskyModel.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/GenericSimilarityEstimatorModel.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/BatetModel.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/BraunBlanquetModel.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/SimpsonModel.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/OchiaiModel.scala (diff)
Commit 2c041b6f1ee75893585a1566b2ed967ecf396694 by carsten.draschner
collected all currently implemented similarity estimations as minimal calls
(commit: 2c041b6)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/examples/minimalCalls.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/TverskyModel.scala (diff)
Commit 0f4c19a9dae34becaa16ec72f5eae3adfefb9de2 by carsten.draschner
changed main class to experiment call
(commit: 0f4c19a)
The file was modifiedsansa-ml-spark/pom.xml (diff)
Commit a919c19946e71532d68bacb43e75bcd2be0b6a7a by carsten.draschner
config resolver created by farshad to dynamically handle local and hdfs file paths
(commit: a919c19)
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/utils/ConfigResolver.scala
Commit 6c60d80c0fb094d92595a2b9e3a5bfd0c8f03fb1 by carsten.draschner
move outputpath to config, read-in changed to use config resolver so hdfs and local are usable, currently not working
(commit: 6c60d80)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/experiment/SimilarityPipelineExperiment.scala (diff)
Commit 6a8c228fa35dbb5ca77dfe275291c9ab79b1f980 by carsten.draschner
created object to test server usage with minimal complexity
(commit: 6a8c228)
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/experiment/MinimalServerExperiment.scala
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/experiment/SimilarityPipelineExperiment.scala (diff)
Commit 07147f7a4d2bb59eba75913bc217718f68f0ade8 by carsten.draschner
file lister implemented by farshad so both local and hdfs files are listable
(commit: 07147f7)
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/utils/FileLister.scala
Commit 5ffa1450832d9748f589f391163b35f6ed7cd636 by Patrick Westphal
Fixed wrong method calls in similarity examples
(commit: 5ffa145)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/Tversky.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/Jaccard.scala (diff)
Commit eebee24aaf2eabac345c066e940e15deb5df541f by Patrick Westphal
Made FeatureExtractorModel inherit from Transformer
(commit: eebee24)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/utils/FeatureExtractorModel.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/Tversky.scala (diff)
Commit 7e6c7f9392dc9b523c9fb155992885cd6466a9aa by carsten.draschner
added number of runs as parameter for more stable processing times calculation
(commit: 7e6c7f9)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/experiment/SimilarityPipelineExperiment.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/test_run_worksheet.sc (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/Jaccard.scala (diff)
The file was removedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/experiment/MinimalServerExperiment.scala
Commit 000a9f4c2324335949a5ca0d00c98a2000a9404a by carsten.draschner
this was a first attempt but is now resolved in cleaner substructures
(commit: 000a9f4)
The file was removedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/minHash_rdf_over_text_pipeline.scala
Commit ca4d9f5b157842e210cbcedf8c712deaa55fa773 by carsten.draschner
better place now for certain try outs
(commit: ca4d9f5)
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/utils/test_run_worksheet.sc
The file was removedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/test_run_worksheet.sc
Commit 2027a671dfff30284e3d187f0ee1332692aa9f12 by carsten.draschner
tversky added and key retrieval aligned.
(commit: 2027a67)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/experiment/SimilarityPipelineExperiment.scala (diff)
Commit 86896bffa319acf86b794c7eed4c24132eb8d825 by carsten.draschner
bug fixing if both feature vectors are empty
(commit: 86896bf)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/JaccardModel.scala (diff)
Commit 6dea37eb3ec08d669a8378f38bce0c4c8113efc2 by carsten.draschner
if feature vector union is empty return 0
(commit: 6dea37e)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/TverskyModel.scala (diff)
Commit 09bab6c554bf69f3358f72daebd55fe9e29f270e by carsten.draschner
key generation once in the beginning and length of dataframes in prints
(commit: 09bab6c)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/experiment/SimilarityPipelineExperiment.scala (diff)
Commit a1ad0814fc076068592825311a196fa1407a0a34 by carsten.draschner
added further examples in minimal calls like subset estimation or stacked approaches
(commit: a1ad081)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/examples/minimalCalls.scala (diff)
Commit 0558c5386181d16326b79cf8667eb96bf49e0ac6 by carsten.draschner
tests with filter option for only considering movies
(commit: 0558c53)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/examples/minimalCalls.scala (diff)
Commit 41f86534f3a87691f4e64e59480da67d099186b5 by carsten.draschner
added stacked option and give the option to stop at specified points in pipeline to try out parts of pipeline
(commit: 41f8653)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/experiment/SimilarityPipelineExperiment.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/utils/FeatureExtractorModel.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/Jaccard.scala (diff)
Commit d2d9ef6f923cc19210c56cf9fac9b4eb60812972 by carsten.draschner
autoimports reorganize and limit dataframe show to 10 rows to test if this results in out of memory
(commit: d2d9ef6)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/experiment/SimilarityPipelineExperiment.scala (diff)
Commit aecc3a944e6a65f7515e528d16b90f75dad6422c by carsten.draschner
removed some command line print typos and rearranged minor things
(commit: aecc3a9)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/experiment/SimilarityPipelineExperiment.scala (diff)
Commit 831bb80db40ed872e4c3ebd426f3ef2e3f6c42ff by carsten.draschner
not needed, merged into FeatureExtractorModel. we now have overloaded transform for Dataset/Dataframe and for RDD Triple Node
(commit: 831bb80)
The file was removedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/utils/RDF_Feature_Extractor.scala
Commit bdfa143dc0c541483615a0a6775e7458a34e8eba by carsten.draschner
added comments aligned to scala doc and limit dataframe output in cmd print
(commit: bdfa143)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/experiment/SimilarityPipelineExperiment.scala (diff)
Commit b0b715d5531fa2d9284ea786db2f0b124f8f215d by carsten.draschner
created docstrings aligned to scala doc and merged overloaded transform to make RDD Triple Node based read in possible
(commit: b0b715d)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/utils/FeatureExtractorModel.scala (diff)
Commit dcd824897b643ad39fc9ea2d47c76d66eabd575d by carsten.draschner
created scala doc aligned docstrings
(commit: dcd8248)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/GenericSimilarityEstimatorModel.scala (diff)
Commit 04eaa21f04c2a24f098b9e84301f6ce54e6bbc14 by carsten.draschner
scala docstrings to describe briefly what this class is about
(commit: 04eaa21)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/examples/minimalCalls.scala (diff)
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/utils/SimilarityExperimentMetaGraphFactory.scala
The file was removedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/utils/Similarity_Experiment_Meta_Graph_Factory.scala
Commit d3a6f7c39fc83a5fb60c54c88c0a6f3c33c07b72 by carsten.draschner
changed the name for one relation
(commit: d3a6f7c)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/utils/SimilarityExperimentMetaGraphFactory.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/utils/FeatureExtractorModel.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/JaccardModel.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/Tversky.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/Jaccard.scala (diff)
Commit 7f6e75cb5d706ec9aaa956061d506e77d9f0f0e2 by carsten.draschner
refactor camel case and switch to dataframe from rdd read in
(commit: 7f6e75c)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/MinHash.scala (diff)
Commit 77c8060d852726a51d1b5399ac60e89e158a1e9f by carsten.draschner
move local[*] to conf and not in code, quick and dirty filtering for hands on try out. started with logging instead of printing
(commit: 77c8060)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/experiment/SimilarityPipelineExperiment.scala (diff)
Commit 9de95e2e3175481f091484f91788ce34bccbe9d2 by carsten.draschner
separated filter part for relevant uris
(commit: 9de95e2)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/examples/minimalCalls.scala (diff)
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/ReadMe.md
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/examples/minimalCalls.scala (diff)
Commit b8bdeeac16f1c245519b680903161125a90f3913 by carsten.draschner
renamed package to a better suited, camel case name
(commit: b8bdeea)
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarityEstimationModels/BraunBlanquetModel.scala
The file was removedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/OchiaiModel.scala
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/experiment/SimilarityPipelineExperiment.scala (diff)
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarityEstimationModels/BatetModel.scala
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/examples/minimalCalls.scala (diff)
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarityEstimationModels/JaccardModel.scala
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarityEstimationModels/OchiaiModel.scala
The file was removedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/JaccardModel.scala
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarityEstimationModels/DiceModel.scala
The file was removedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/GenericSimilarityEstimatorModel.scala
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/Tversky.scala (diff)
The file was removedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/BatetModel.scala
The file was removedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/BraunBlanquetModel.scala
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/Jaccard.scala (diff)
The file was removedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/SimpsonModel.scala
The file was removedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/TverskyModel.scala
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarityEstimationModels/GenericSimilarityEstimatorModel.scala
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarityEstimationModels/SimpsonModel.scala
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarityEstimationModels/TverskyModel.scala
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/Resnik.scala
The file was removedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarity_measures/DiceModel.scala
Commit 2593ad9f10247a175a86cff2658453f7f58e6a50 by carsten.draschner
started with unit tests for semantic similarity estimation
(commit: 2593ad9)
The file was addedsansa-ml-spark/src/test/resources/similarity/movie.nt
The file was addedsansa-ml-spark/src/test/scala/net/sansa_stack/ml/spark/similarity/similarityUnitTest.scala
Commit a1b3f163b297ca8cc8dbdd574bdbf325f734a1c5 by carsten.draschner
moved spark setup inside each run so overhead is present in each run
(commit: a1b3f16)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/experiment/SimilarityPipelineExperiment.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/ReadMe.md (diff)
Commit 4d96d04d0e9256e1b1b7d0994d51addb2bdfc352 by carsten.draschner
tests changed from show to count because it is faster
(commit: 4d96d04)
The file was modifiedsansa-ml-spark/src/test/scala/net/sansa_stack/ml/spark/similarity/similarityUnitTest.scala (diff)
Commit e428ff461136e609afd31e5854e3645890b99ef9 by carsten.draschner
added information s.t. metagraph creator can easily fetch that all are similarity estimators
(commit: e428ff4)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarityEstimationModels/GenericSimilarityEstimatorModel.scala (diff)
Commit 5f8766434574d0ddfe0c419c96d3c70340cd7d6e by carsten.draschner
changed default column labeling from underscore to camel case
(commit: 5f87664)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarityEstimationModels/DiceModel.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarityEstimationModels/BraunBlanquetModel.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarityEstimationModels/SimpsonModel.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarityEstimationModels/OchiaiModel.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarityEstimationModels/TverskyModel.scala (diff)
Commit d0cb5fd85638731ae6bb82ae9f71ca88ffe8416c by carsten.draschner
created minhash on basis of apache spark minhash lsh and added behavior as in other generic similarity estimators to allow better follow up handling of columns for e.g. metagraph creation
(commit: d0cb5fd)
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarityEstimationModels/MinHashModel.scala
Commit 2e3a137b7f26c18f8c5021d478d72d9319630b8e by carsten.draschner
created novel method for metagraph creation using better literal creation etc and clearer parameters
(commit: 2e3a137)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/utils/SimilarityExperimentMetaGraphFactory.scala (diff)
Commit f5e69849548fe024dcaec37a7a64887f3e81d58f by carsten.draschner
cleaned shape and tryouts for other datasets
(commit: f5e6984)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/experiment/SimilarityPipelineExperiment.scala (diff)
Commit ad651796c108c9294b66164a44898a4b27af3c05 by carsten.draschner
minimal calls now with novel minHash and novel metagraph creation
(commit: ad65179)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/examples/minimalCalls.scala (diff)
Commit 6a9de1c7f3c9d715825fc3feed0ed0533bf7ccaf by carsten.draschner
new default value for column labeling
(commit: 6a9de1c)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarityEstimationModels/GenericSimilarityEstimatorModel.scala (diff)
Commit 34d6453e02e81c8569a1a27a6da0be7de2f48d76 by carsten.draschner
removed because not needed anymore. we created a new file for full pipeline with all estimators instead of having a dedicated one for each model
(commit: 34d6453)
The file was removedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/Tversky.scala
Commit ddf28c1b2d5b02f68fa8bc614ae59c484c5f0886 by carsten.draschner
new approach to have one concise pipeline for easy usage instead of having to construct each pipeline separately
(commit: ddf28c1)
The file was addedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/SimilarityPipeline.scala
Commit 87e7b6bf6c22d361910248a356593d843c3f2f23 by carsten.draschner
min hash deprecated and now moved to similarityPipeline
(commit: 87e7b6b)
The file was removedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/MinHash.scala
Commit bf3c6d61185c84193f4302d0260427ca27835835 by carsten.draschner
changes for novel metagraph creation
(commit: bf3c6d6)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/Jaccard.scala (diff)
Commit f5958f2be4f15aa9d6284c9e90874585886deeae by carsten.draschner
replaces show by count to make tests faster
(commit: f5958f2)
The file was modifiedsansa-ml-spark/src/test/scala/net/sansa_stack/ml/spark/similarity/similarityUnitTest.scala (diff)
Commit f3287963462395b388bb52e53533f356eefca59d by carsten.draschner
added comments and parameters, pipeline is working
(commit: f328796)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/SimilarityPipeline.scala (diff)
Commit c9588a5b60e9d43878c644a8eb70639cd3a81deb by carsten.draschner
added another alternative to start main
(commit: c9588a5)
The file was modifiedsansa-ml-spark/pom.xml (diff)
Commit 9a025f12d7151bdfa723ece18488c80b5dd80b08 by carsten.draschner
jaccard now merged into similarity Pipeline
(commit: 9a025f1)
The file was removedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/run/Jaccard.scala
Commit 87182834d49599f1d8596a1714a46d0cdba5f2a7 by carsten.draschner
keeps track of experiment number when running over extensive hyperparameter grid and printing this in cmd line
(commit: 8718283)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/experiment/SimilarityPipelineExperiment.scala (diff)
The file was modifiedsansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/similarityEstimationModels/MinHashModel.scala (diff)
Commit 7dd393139d6026d3b3caca467c3f9cf0546c0caa by carsten.draschner
extended tests and gave each pipeline module its own test
(commit: 7dd3931)
The file was modifiedsansa-ml-spark/src/test/scala/net/sansa_stack/ml/spark/similarity/similarityUnitTest.scala (diff)
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/filter_box_right.png
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/index/index-a.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/index/index-p.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/ref-index.css
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/filterboxbarbg.png
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/utils/SimilarityExperimentMetaGraphFactory.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/index/index-r.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/fullcommenttopbg.gif
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/algorithms/BorderFlow.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/datatypes/Clustering.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/navigation-li-a.png
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/classification/package.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/filter_box_left2.gif
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/typebg.gif
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/package.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/classification/KB$$KB$KB.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/similarity/similarityEstimationModels/JaccardModel.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/diagrams.js
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/similarity/similarityEstimationModels/GenericSimilarityEstimatorModel.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/kge/linkprediction/models/TransE.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/kernel/package.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/arrow-right.png
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/classification/RefinementOperator.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/algorithms/SilviaClustering$.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/mining/amieSpark/Rules$.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/index/index-y.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/selected-right.png
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/index/index-s.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/index/index-l.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/mining/amieSpark/KBObject$$KB.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/filterboxbarbg.gif
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/jquery-ui.js
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/mining/amieSpark/KBObject$.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/classification/PerformanceMetrics$.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/classification/ClassMembership$.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/mining/amieSpark/SQLSchema$.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/index/index-i.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/index/index-x.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/datatypes/package.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/similarity/similarityEstimationModels/MinHashModel.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/class_big.png
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/datatypes/DbStatusEnum$.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/classification/TDTInducer$$TDTInducer.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/selected2-right.png
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/index/index-o.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/similarity/run/package.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/index/index-f.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/index/index-e.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/filterboxbg.gif
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/outliers/vandalismdetection/Classifier$.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/index/index-c.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/utils/DBCLusterer.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/kernel/RDFFastGraphKernel.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/similarity/similarityEstimationModels/OchiaiModel.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/classification/ClassMembership$$ClassMembership.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/mining/amieSpark/KBObject$$Atom.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/kernel/RDFFastTreeGraphKernel$.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/type_big.png
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/utils/package.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/algorithms/PIC$.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/selected-right-implicits.png
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/kge/linkprediction/run/TransERun$.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/similarity/experiment/SimilarityPipelineExperiment$.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/package_big.png
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/mining/amieSpark/package.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/classification/DLTree.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/algorithms/package.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/kge/linkprediction/crossvalidation/package.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/type_diagram.png
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/kge/linkprediction/crossvalidation/Bootstrapping.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/utils/DataProcessing.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/package.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/utils/DataFiltering.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/utils/package.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/index/index-j.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/index/index-n.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/datatypes/MdsCoordinates.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/index/index-b.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/type.png
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/utils/Common$.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/class.png
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/mining/amieSpark/MineRules$.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/kge/linkprediction/crossvalidation/rateException.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/object_diagram.png
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/mining/amieSpark/EmptyRDFGraphDataFrame$.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/valuemembersbg.gif
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/algorithms/Kmeans$.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/kernel/RDFFastTreeGraphKernelApp$.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/outliers/anomalydetection/AnomalyWithHashingTF$.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/object_to_class_big.png
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/remove.png
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/permalink.png
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/mining/amieSpark/MineRules$$Algorithm.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/index.js
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/similarity/similarityEstimationModels/TverskyModel.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/packagesbg.gif
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/outliers/package.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/mining/amieSpark/RDFGraphDataFrame.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/kernel/RDFFastTreeGraphKernel_v2.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/unselected.png
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/trait.png
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/package$$clusterT.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/outliers/anomalydetection/AnomalyWithHashingTF.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/ownderbg2.gif
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/kge/package.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/similarity/examples/package.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/trait_big.png
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/utils/FeatureExtractorModel.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/selected.png
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/index/index-d.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/similarity/similarityEstimationModels/package.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/index.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/constructorsbg.gif
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/outliers/vandalismdetection/package.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/signaturebg.gif
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/index/index-t.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/package.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/algorithms/ClusterAlgo.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/index.css
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/similarity/run/Resnik$.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/algorithms/RDFByModularityClustering$.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/datatypes/Spark.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/tools.tooltip.js
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/algorithms/PIC.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/index/index-_.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/kge/linkprediction/models/DistMult.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/kge/linkprediction/run/package.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/mining/amieSpark/Rules$$RuleContainer.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/kge/linkprediction/crossvalidation/CrossValidation.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/object_big.png
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/ownerbg2.gif
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/datatypes/MdsCoordinate.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/conversionbg.gif
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/utils/Grid.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/filterbg.gif
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/datatypes/Cluster.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/scheduler.js
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/algorithms/FirstHardeninginBorderFlow$.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/mining/amieSpark/RDFGraphNative.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/similarity/similarityEstimationModels/DiceModel.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/arrow-down.png
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/mining/package.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/similarity/run/SimilarityPipeline$.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/modernizr.custom.js
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/algorithms/Kmeans.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/outliers/anomalydetection/package.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/algorithms/SilviaClustering.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/algorithms/MultiDS.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/similarity/similarityEstimationModels/SimpsonModel.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/datatypes/SpatialObject.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/similarity/examples/minimalCalls$.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/kernel/RDFFastTreeGraphKernel_v2$.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/classification/KB$.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/signaturebg2.gif
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/kernel/Uri2Index$.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/object_to_trait_big.png
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/template.js
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/trait_to_object_big.png
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/kge/linkprediction/evaluate/Evaluate$.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/package.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/classification/ConceptsGenerator$.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/datatypes/DBSCANParam.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/selected2.png
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/datatypes/AppConfig.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/classification/KB$$KB.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/classification/RefinementOperator$.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/class_diagram.png
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/datatypes/CoordinatePOI.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/kernel/RDFFastGraphKernel$.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/kge/linkprediction/models/package.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/classification/TDTClassifiers$.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/datatypes/POI.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/object_to_type_big.png
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/kge/linkprediction/prediction/package.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/classification/TDTInducer$.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/utils/NodeIndexerModel.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/kge/linkprediction/crossvalidation/withIndex.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/template.css
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/index/index-v.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/index.js
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/similarity/similarityEstimationModels/BatetModel.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/class_to_object_big.png
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/mining/amieSpark/AbstractRDFGraph.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/similarity/package.html
The file was addeddocs/scaladocs/0.7.1_ICSC_paper/lib/jquery.layout.js
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/algorithms/BorderFlow$.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/kernel/RDFFastTreeGraphKernel.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/index/index-h.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/kge/linkprediction/prediction/Evaluate.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/mining/amieSpark/DfLoader$.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/index/index-k.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/classification/Registrator.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/mining/amieSpark/RDFGraph.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/algorithms/RDFGraphPowerIterationClustering$.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/package.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/package$$ClusteringAlgorithm$.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/kge/linkprediction/run/TriplesRun$.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/algorithms/DBSCAN$.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/outliers/anomalydetection/AnomalyDetectionWithCountVetcorizerModel$.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/lib/defbg-green.gif
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/datatypes/DbPOI.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/algorithms/Encoder.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/outliers/vandalismdetection/parser/package.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/similarity/similarityEstimationModels/BraunBlanquetModel.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/index/index-g.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/kernel/RDFFastTreeGraphKernelUtil$.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/outliers/anomalydetection/AnomalWithDataframeCrossJoin$.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/index/index-w.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/kge/linkprediction/crossvalidation/kFold.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/classification/TDTClassifiers$$TDTClassifiers.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/classification/TermDecisionTrees$.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/datatypes/Categories.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/algorithms/DBSCAN.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/kge/linkprediction/models/Models.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/lib/trait_diagram.png
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/kge/linkprediction/crossvalidation/Holdout.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/lib/diagrams.css
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/kge/linkprediction/evaluate/package.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/outliers/anomalydetection/AnomalyDetectionWithCountVetcorizerModel.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/outliers/vandalismdetection/VandalismDetection.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/lib/filter_box_left.png
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/utils/NodeIndexer.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/kge/linkprediction/package.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/lib/object.png
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/similarity/experiment/package.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/kge/linkprediction/prediction/PredictTransE.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/datatypes/DistanceMatrix.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/utils/FileLister$.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/lib/navigation-li.png
The file was added docs/scaladocs/0.7.1_ICSC_paper/lib/selected-implicits.png
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/datatypes/Datasets.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/outliers/vandalismdetection/parser/XML$.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/kge/linkprediction/crossvalidation/kException.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/classification/ConceptsGenerator$$ConceptsGenerator.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/datatypes/Distance.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/index/index-m.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/algorithms/Distances.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/lib/type_to_object_big.png
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/algorithms/RDFGraphPowerIterationClustering.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/lib/ownerbg.gif
The file was added docs/scaladocs/0.7.1_ICSC_paper/lib/package.png
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/clustering/datatypes/Clusters.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/mining/amieSpark/RDFGraphLoader$.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/lib/jquery.js
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/package.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/mining/amieSpark/DfLoader$$Atom.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/mining/amieSpark/RDFTriple.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/outliers/anomalydetection/AnomalWithDataframeCrossJoin.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/net/sansa_stack/ml/spark/utils/ConfigResolver.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/index/index-u.html
The file was added docs/scaladocs/0.7.1_ICSC_paper/lib/defbg-blue.gif
The file was added docs/index.md
The file was added docs/index.html
The file was modified docs/index.md (diff)
The file was removed docs/index.html
Commit db600623d0ee2b76489a5b8fe70add61d47ce677 by GitHub
Update index.md

added first few links to project
(commit: db60062)
The file was modified docs/index.md (diff)
Commit e18a5a7fea2b2e7a11a219f8bd06844d8ca202fb by GitHub
Update index.md

better embedding of links
(commit: e18a5a7)
The file was modified docs/index.md (diff)
Commit 644b19ee79462d0ecee85ce50a72ce4bd60ea46b by GitHub
Update index.md

adding header
(commit: 644b19e)
The file was modified docs/index.md (diff)
The file was modified sansa-ml-spark/pom.xml (diff)
Commit caf057ba3f3cb2c4669dce4aa67bc19f9586ca1d by carsten.draschner
added more models to experiments and fixed bug in counting total experiments
(commit: caf057b)
The file was modified sansa-ml-spark/src/main/scala/net/sansa_stack/ml/spark/similarity/experiment/SimilarityPipelineExperiment.scala (diff)
The file was added sansa-ml-spark/src/main/resources/movieData/movie.nt
The file was removed sansa-ml-spark/src/main/resources/movie.nt
Commit 20275c26175cb6c043c2589d0bc9da36fb6ed608 by carsten.draschner
sample parameter setup for similarity evaluation
(commit: 20275c2)
The file was added sansa-ml-spark/src/main/resources/parameterConfig.conf
Commit b9c1099d1158d2e14343a48750c8d0f3b3328713 by GitHub
Update .travis.yml

Java Stack size increased for Scala compiler
(commit: b9c1099)
The file was modified .travis.yml (diff)
The file was modified .travis.yml (diff)
The file was modified .travis.yml (diff)
The file was added .github/workflows/main.yml