** Datasets for training

1- Passage scores do not have to be binary (0 or 1). They can be graded on a scale from 0 to 1 (0, 0.25, 0.5, 0.75, 1). We can get this score from an LLM and use it in the loss calculation.
2- The datasets need preprocessing: use an LLM to remove unsuitable negative or positive passages.
3- MIRACL dataset: questions = 2107, passages = 21844. Some passages labeled negative may actually be relevant.
4- Cross-lingual datasets can be useful: query in the first language, passage in the second language.
5- SWIM-IR dataset: the passages are given and the queries were generated from them. The quality is very poor for Persian.
6- ParsiNLU dataset: questions = 600, passages = 600. All passages are positive.
7- PersianQA dataset: questions = 6306, passages = fewer than the questions (every passage has multiple queries). All passages are positive. Be careful: some queries are impossible to answer.
8- PQuAD dataset: questions = 48273, passages = 10082 (every passage has multiple queries). All passages are positive. Be careful: some queries are impossible to answer.
9- longragfa dataset: long documents and queries, intended for evaluation. Questions = 250, passages = 1500. Not used.
10- Synthetic-persian-qa-retrieval dataset: questions = 223423, passages = 250000. The negative passages are not clearly different from the positives, so it needs preprocessing.

No training
NDCG: 0.8452119768348717
Recall 7: 0.3373666606161222
Recall 12: 0.48390155482482855
Recall 20: 0.6340810809380268
Recall Variant: 0.44313617731261423
Precision 7: 0.4714285714285715
Precision 12: 0.41999999999999993
Precision 20: 0.358

Train with 100
NDCG: 0.8007791818263832
Recall 7: 0.2617863643550479
Recall 12: 0.3759745806720163
Recall 20: 0.5564983103150418
Recall Variant: 0.36642345327979325
Precision 7: 0.3828571428571429
Precision 12: 0.3449999999999999
Precision 20: 0.311

Train with 100, with LoRA
NDCG: 0.8432282495018343
Recall 7: 0.33695911259587386
Recall 12: 0.4729916144600827
Recall 20: 0.6212526155736547
Recall Variant: 0.43208929205133273
Precision 7: 0.4685714285714285
Precision 12: 0.4099999999999999
Precision 20: 0.35200000000000004

Train with 100, with prompt
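
Side note on the metrics: the runs above report NDCG plus Recall and Precision at cutoffs 7, 12 and 20. Below is a minimal sketch of how such cutoff metrics are commonly computed. It is not necessarily how the numbers above were produced. The function names, passage ids and example data are hypothetical, and "Recall Variant" is left out because these notes do not define it. The NDCG helper takes graded gains, so LLM-assigned scores like the 0 to 1 scale in item 1 would fit in directly.

```python
import math


def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k retrieved passages that are relevant."""
    hits = sum(1 for pid in ranked_ids[:k] if pid in relevant_ids)
    return hits / k


def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of all relevant passages that appear in the top k."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for pid in ranked_ids[:k] if pid in relevant_ids)
    return hits / len(relevant_ids)


def ndcg_at_k(ranked_ids, gains, k):
    """NDCG with graded gains.

    gains maps passage id -> relevance (e.g. 0, 0.25, 0.5, 0.75, 1).
    Passages missing from gains count as 0.
    """
    dcg = sum(
        gains.get(pid, 0.0) / math.log2(rank + 2)
        for rank, pid in enumerate(ranked_ids[:k])
    )
    ideal_gains = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(rank + 2) for rank, g in enumerate(ideal_gains))
    return dcg / idcg if idcg > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical ranking for one query (ids are made up for illustration).
    ranked = ["p3", "p1", "p7", "p2"]
    gains = {"p1": 1.0, "p2": 0.5, "p9": 0.25}
    relevant = {pid for pid, g in gains.items() if g > 0}

    for k in (2, 4):
        print(k, precision_at_k(ranked, relevant, k),
              recall_at_k(ranked, relevant, k),
              ndcg_at_k(ranked, gains, k))
```

Per-query values would then be averaged over all evaluation queries to get one number per metric and cutoff.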