{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Simulating data and power analysis"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Tom Ellis, August 2017"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Created using FAPS version 2.6.6.\n"
     ]
    }
   ],
   "source": [
    "import numpy as np\n",
    "import faps as fp\n",
    "import matplotlib.pyplot as plt\n",
    "import pandas as pd\n",
    "from time import time, localtime, asctime\n",
    "\n",
    "print(\"Created using FAPS version {}.\".format(fp.__version__))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before committing to the time and cost of genotyping samples for a paternity study, it is always sensible to run simulations to test the likely statistical power of your data set. This can help with important questions regarding study design, such as finding an appropriate balance between the number of families vs offspring per family, or identifying a minimum number of loci to type. Simulated data can also be useful in verifying the results of an analysis.\n",
    "\n",
    "FAPS provides tools to run such simulations. In this notebook we look at:\n",
    "\n",
    "1. Basic tools for simulating genotype data.\n",
    "2. Automated tools for power analysis.\n",
    "3. Crafting custom simulations for specialised purposes.\n",
    "4. Simulations using empirical datasets (under construction).\n",
    "\n",
    "It is worth noting that I relied on loops for a lot of these tools, for the purely selfish reason that it was easy to code. Loops are of course slow, so if you work with these tools a lot there is ample scope for speeding things up (see especially the functions `make_offspring`, `make_sibships` and `make_power`)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Simulation building blocks"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Creating `genotypeArray` objects"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Simulations are built using `genotypeArrays`. See the section on these [here](http://localhost:8889/notebooks/docs/02%20Genotype%20data.ipynb) for more information.\n",
    "\n",
    "`make_parents` generates a population of reproductive adults from population allele frequencies.\n",
    "This example creates ten individuals.\n",
    "Note that this population will be in Hardy-Weinberg equilibrium, but yours may not."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "np.random.seed(37)\n",
    "allele_freqs = np.random.uniform(0.2, 0.5, 50)\n",
    "adults = fp.make_parents(10,  allele_freqs, family_name='adult')"
   ]
  },
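  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough illustration of what drawing genotypes under Hardy-Weinberg equilibrium means, the sketch below samples diploid genotypes with plain NumPy.\n",
    "It is not how `make_parents` is implemented; the function name `sample_hwe_genotypes` and the array layout (individuals by loci by two alleles, coded 0/1) are assumptions made for this example only.\n",
    "It uses its own random generator, so the seed set above is not disturbed."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "def sample_hwe_genotypes(n_individuals, allele_freqs, rng):\n",
    "    # Hypothetical helper for illustration; not part of FAPS.\n",
    "    # Under Hardy-Weinberg equilibrium the two alleles at a locus are\n",
    "    # independent draws, each a Bernoulli trial with p = allele frequency.\n",
    "    allele_freqs = np.asarray(allele_freqs)\n",
    "    shape = (n_individuals, allele_freqs.size, 2)\n",
    "    return (rng.random(shape) < allele_freqs[None, :, None]).astype(int)\n",
    "\n",
    "rng = np.random.default_rng(37)\n",
    "freqs = rng.uniform(0.2, 0.5, 50)\n",
    "genotypes = sample_hwe_genotypes(10, freqs, rng)\n",
    "print(genotypes.shape)  # (individuals, loci, alleles per locus)"
   ]
  },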
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "There are multiple ways to mate adults to generate offspring. If you supply a set of adults and an integer number of offspring, `make_offspring` mates adults at random."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array(['adult_7/adult_2', 'adult_1/adult_6', 'adult_8/adult_3',\n",
       "       'adult_8/adult_0', 'adult_0/adult_7'], dtype='<U15')"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "family1 = fp.make_offspring(parents = adults, noffs=5)\n",
    "family1.parents"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can also supply an explicit list of dams and sires, in which case the adults are paired in the order they appear in each list."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array(['adult_7/adult_2', 'adult_1/adult_6', 'adult_8/adult_3',\n",
       "       'adult_8/adult_0', 'adult_0/adult_7'], dtype='<U15')"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "family2 = fp.make_offspring(parents = adults, dam_list=[7,1,8,8,0], sire_list=[2,6,3,0,7])\n",
    "family2.parents"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Usually we really want to simulate half-sib arrays. This can be done using `make_sibships`, which mates a single mother to a set of males."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array(['adult_0/adult_1', 'adult_0/adult_1', 'adult_0/adult_1',\n",
       "       'adult_0/adult_1', 'adult_0/adult_1', 'adult_0/adult_2',\n",
       "       'adult_0/adult_2', 'adult_0/adult_2', 'adult_0/adult_2',\n",
       "       'adult_0/adult_2', 'adult_0/adult_3', 'adult_0/adult_3',\n",
       "       'adult_0/adult_3', 'adult_0/adult_3', 'adult_0/adult_3',\n",
       "       'adult_0/adult_4', 'adult_0/adult_4', 'adult_0/adult_4',\n",
       "       'adult_0/adult_4', 'adult_0/adult_4'], dtype='<U15')"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "family3 = fp.make_sibships(parents=adults, dam=0, sires=[1,2,3,4], family_size=5)\n",
    "family3.parents"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For uneven sibship sizes, give a list of sizes for each family of the same length as `sires`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array(['adult_0/adult_1', 'adult_0/adult_1', 'adult_0/adult_1',\n",
       "       'adult_0/adult_1', 'adult_0/adult_1', 'adult_0/adult_2',\n",
       "       'adult_0/adult_2', 'adult_0/adult_2', 'adult_0/adult_2',\n",
       "       'adult_0/adult_3', 'adult_0/adult_3', 'adult_0/adult_3',\n",
       "       'adult_0/adult_4', 'adult_0/adult_4'], dtype='<U15')"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "family4 = fp.make_sibships(parents=adults, dam=0, sires=[1,2,3,4], family_size=[5,4,3,2])\n",
    "family4.parents"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Adding errors"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Real data almost always contains errors. For SNP data, these take the form of:\n",
    "\n",
    "* Missing data, where a locus fails to amplify for some reason.\n",
    "* Genotyping errors, when the observed genotype at a locus is not the actual genotype.\n",
    "\n",
    "These are straightforward to include in simulated data. First generate some clean data again, and mate the parents."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "np.random.seed(85)\n",
    "allele_freqs = np.random.uniform(0.2, 0.5, 50)\n",
    "adults = fp.make_parents(10,  allele_freqs, family_name='adult')\n",
    "progeny = fp.make_sibships(parents=adults, dam=0, sires=[1,2,3,4], family_size=5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It is best to create the progeny before adding errors. Set the error rates and add errors at random."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "d, mu = 0.01, 0.0015 # values for dropout and error rate.\n",
    "# add genotyping errors\n",
    "adults_mu  = adults.mutations(mu)\n",
    "progeny_mu = progeny.mutations(mu)\n",
    "\n",
    "# add dropouts (to the mutated data)\n",
    "adults_mu  = adults_mu.dropouts(d)\n",
    "progeny_mu = progeny_mu.dropouts(d)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`mutations` and `dropouts` make copies of the `genotypeArray`, so the original data remains unchanged. For example:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.0\n",
      "0.012000000000000002\n"
     ]
    }
   ],
   "source": [
    "print(adults.missing_data().mean())\n",
    "print(adults_mu.missing_data().mean())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Paternity and sibships"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Create a `paternityArray` and cluster into sibships as usual (more information on these objects can be found [here](https://github.com/ellisztamas/faps/blob/master/docs/03%20Paternity%20arrays.ipynb) and [here](http://localhost:8889/notebooks/docs/04%20Sibship%20clustering.ipynb))."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [],
   "source": [
    "np.random.seed(85)\n",
    "allele_freqs = np.random.uniform(0.4, 0.5, 50)\n",
    "adults = fp.make_parents(10,  allele_freqs, family_name='adult')\n",
    "progeny = fp.make_sibships(parents=adults, dam=0, sires=[1,2,3,4], family_size=5)\n",
    "mothers = adults.subset(progeny.mothers)\n",
    "patlik = fp.paternity_array(progeny, mothers, adults, mu=0.0015, missing_parents=0.01, integration='partial')\n",
    "sc = fp.sibship_clustering(patlik)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A very useful tool is the `accuracy` method of `sibshipCluster` objects.\n",
    "When the paternity and sibship structure are known (seldom the case in real life, but true for simulated data) this returns an array of handy information about the analysis:\n",
    "\n",
    "0. Binary indicator for whether the true partition was included in the sample of partitions.\n",
    "1. Difference in log likelihood for the maximum likelihood partition identified and the true partition. Positive values indicate that the ML partition had greater support than the true partition.\n",
    "2. Posterior probability of the true number of families.\n",
    "3. Mean probabilities that a pair of true full sibs are identified as full sibs.\n",
    "4. Mean probabilities that a pair of true half sibs are identified as half sibs.\n",
    "5. Mean probabilities that a pair of true half or full sibs are correctly assigned as such (i.e. overall accuracy of sibship reconstruction).\n",
    "6. Mean (log) probability of paternity of the true sires for those sires who had been sampled (who had non-zero probability in the paternityArray).\n",
    "7. Mean (log) probability that the sire had not been sampled for those individuals whose sire was truly absent."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 1.   , 28.68 ,  0.   ,  0.771,  0.952,  1.   ,  1.   ,  0.1  ])"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sc.accuracy(progeny, adults)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this example, accuracy is high, but the probability of a missing sire is NaN because all the sires are present, and this number is calculated only for offspring whose sire was absent.\n",
    "\n",
    "We can adjust the `paternityArray` to see how much this affects the results.\n",
    "For example, if we remove the sire of the first family (i.e. the male indexed by 1), there is a drop in the accuracy for full-sibling relationships, although half-sibling relationships are unaffected."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 1.   , 29.38 ,  0.   ,  0.771,  0.952,  1.   ,  1.   ,  0.1  ])"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "patlik.purge = 'adult_1'\n",
    "patlik.missing_parents=0.5\n",
    "sc = fp.sibship_clustering(patlik)\n",
    "sc.accuracy(progeny, adults)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In contrast, imagine we had an idea that selfing was strong. How would this affect things?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 1.   , 28.68 ,  0.   ,  0.771,  0.952,  1.   ,  1.   ,  0.1  ])"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "patlik.selfing_rate=0.5\n",
    "sc = fp.sibship_clustering(patlik)\n",
    "sc.accuracy(progeny, adults)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The results are identical to the unmodified case; FAPS has identified the correct partition structure in spite of the (incorrect) strong prior for high selfing."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Automation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It can be tedious to put together your own simulation for every analysis.\n",
    "FAPS has an automated function that repeatedly creates genotype data, clusters into siblings and calls the `accuracy` function.\n",
    "You can supply lists of variables and it will evaluate each combination. There are a lot of possible inputs, so have a look at the help page using `fp.make_power?`.\n",
    "\n",
    "For example, this code creates four families of five full siblings with a genotyping error rate of 0.0015.\n",
    "It considers 30, 40 and 50 loci for 100, 250 or 500 candidate fathers.\n",
    "Each parameter combination is replicated 10 times.\n",
    "In reality you would want to do more than this; I have found that results tend to asymptote with 300 simulations."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "10 of each parameter combination will be performed.\n",
      "Simulating arrays with multiple number of loci: [30, 40, 50].\n",
      "Drawing allele frequencies between 0.25 and 0.5.\n",
      "Simulating adult populations of multiple sizes: [100, 250, 500].\n",
      "Simulating 4 families of 5 offspring.\n",
      "0% of per-locus genotypes will be removed at random.\n",
      "0.15% of alleles will be mutated at random.\n",
      "Input error rates taken as the real error rates.\n",
      "No candidates to be removed.\n",
      "Proportion missing canidates set to 0.01.\n",
      "Self-fertilisation rate of 0.\n",
      "Performing 1000 Monte Carlo draws for sibship inference.\n",
      "\n",
      "Parameters set. Beginning simulations on Wed Aug 18 11:10:47 2021.\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|██████████| 10/10 [00:12<00:00,  1.26s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Simulations completed after 0.21 minutes.\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    }
   ],
   "source": [
    "# Common simulation parameters\n",
    "r            = 10 # number of replicates\n",
    "nloci        = [30,40,50] # number of loci\n",
    "allele_freqs = [0.25, 0.5] # draw allele frequencies between these bounds\n",
    "nadults      = [100,250,500] # size of the adults population\n",
    "mu           = 0.0015 #genotype error rates\n",
    "sires        = 4\n",
    "offspring    = 5\n",
    "\n",
    "np.random.seed(614)\n",
    "eventab = fp.make_power(\n",
    "    replicates = r, \n",
    "    nloci = nloci,\n",
    "    allele_freqs = allele_freqs,\n",
    "    candidates = nadults,\n",
    "    sires = sires,\n",
    "    offspring = offspring, \n",
    "    missing_loci=0,\n",
    "    mu_real = mu, \n",
    "    unsampled_input=0.01\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For convenience, `make_power` provides a summary of the input parameters.\n",
    "This can be turned off by setting `verbose` to `False`.\n",
    "Similarly, the progress bar can be removed by setting `progress` to `False`.\n",
    "This bar uses IPython widgets, and probably won't work outside of IPython, so it may be necessary to turn it off."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The results of `make_power` are essentially the output from the `accuracy` function we saw before, but include information on simulation parameters, and the time taken to create the `paternityArray` and `sibshipCluster` objects. View them by inspecting `eventab`."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Arguments to set up the population work much like those to create `genotypeArrays`, and are quite flexible.\n",
    "Have a look at the help file (run `fp.make_power?` in IPython) for more.\n",
    "You can also take a look at the [simulations in support of the main FAPS paper](http://localhost:8889/notebooks/manuscript_faps/analysis/A.%20majus%20data%20for%202012.ipynb), which considered a range of contrasting demographic scenarios; the example above is adapted from there."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Error rates and missing candidates are important topics to get a handle on.\n",
    "We can estimate these parameters (e.g. by genotyping some individuals twice and counting how many loci are different), but we can never completely be sure how close to reality we are.\n",
    "With that in mind, `make_power` allows you to simulate data with true values of mu and the proportion of missing sires, but run the analysis with different values.\n",
    "The idea is to estimate how wrong you could be before the analysis fails.\n",
    "For example, this code simulates the case where you assume an error rate of 0.003 and that 5% of the candidates went unsampled, when the true error rate is 0.0015 and 10% of the candidates are actually missing."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "10 of each parameter combination will be performed.\n",
      "Simulating arrays with multiple number of loci: [30, 40, 50].\n",
      "Drawing allele frequencies between 0.25 and 0.5.\n",
      "Simulating adult populations of multiple sizes: [100, 250, 500].\n",
      "Simulating 4 families of 5 offspring.\n",
      "0% of per-locus genotypes will be removed at random.\n",
      "0.15% of alleles will be mutated at random.\n",
      "Genotype error rate of 0.003 will be used to construct paternity arrays.\n",
      "Removing 10.0% of the candidates at random.\n",
      "Proportion missing canidates set to 0.05.\n",
      "Self-fertilisation rate of 0.\n",
      "Performing 1000 Monte Carlo draws for sibship inference.\n",
      "\n",
      "Parameters set. Beginning simulations on Wed Aug 18 11:11:33 2021.\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|██████████| 10/10 [00:12<00:00,  1.29s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Simulations completed after 0.21 minutes.\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    }
   ],
   "source": [
    "fp.make_power(r, nloci, allele_freqs, nadults, sires, offspring, 0,\n",
    "           mu_input= 0.003,\n",
    "           mu_real=0.0015,\n",
    "           unsampled_real=0.1,\n",
    "           unsampled_input = 0.05);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you want to perform downstream analysis, you can tell `make_power` to also export each `paternityArray` and/or `sibshipCluster` object. This is done by setting `return_paternities` and `return_clusters` to `True`. For example, this code pulls out the distribution of family sizes from each `sibshipCluster`, and plots it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|██████████| 10/10 [00:12<00:00,  1.24s/it]\n"
     ]
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXQAAAD4CAYAAAD8Zh1EAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/MnkTPAAAACXBIWXMAAAsTAAALEwEAmpwYAAAeg0lEQVR4nO3dfZRUd53n8fe3H6qhqwl0NZ0nCIEEzMioMUlLfNqYaHSIMwvuqDNEZ0zUXdY1OI7u7CyOezierOfsqLOuMyOjg5pRc1TyMBu3d8TFhyQ+jUnoKIkSxHQIChiTah6Sbhr6ie/+UfeSoqimq7ur6t669/M6pw/1cKn7PZfiw4/fvd/fNXdHREQaX1PUBYiISHUo0EVEEkKBLiKSEAp0EZGEUKCLiCRES1Q7XrhwoS9dujSq3YuINKSHH354wN27y70XWaAvXbqUvr6+qHYvItKQzOxXk72nKRcRkYRQoIuIJIQCXUQkIRToIiIJoUAXEUkIBbqISEIo0EVEEkKB3mAGT4xxz08PRF2GiMSQAr3B9D7yGz5wxyP8+tBw1KWISMwo0BvM08+NAJAfGom4EhGJGwV6gxkIgvzwsdGIKxGRuFGgN5iBwTDQNUIXkdMp0BvM8yP0sYgrEZG4UaA3mPyQRugiUp4CvcEMDBbmzg9pDl1ESijQG8ixkXGOj00AOikqImdSoDeQgaJLFY8o0EWkhAK9geSDK1xy2YymXETkDAr0BhKO0C87b56mXETkDAr0BpIfKoT4ZefPY3h0ghPBfLqICCjQG0rYVHTpuR2AToyKyOkqCnQzW21me8ys38w2lnn/f5nZzuDnl2Z2tOqVCvmhEXLZDOfOawMU6CJyupapNjCzZmAz8HrgALDDzHrd/bFwG3f/QNH27wOuqEGtqTcwOMLCjgxd2Qyga9FF5HSVjNBXAf3uvtfdR4GtwNqzbH8j8LVqFCenGxgaYWFHG7kg0NUtKiLFKgn0RcD+oucHgtfOYGYXA8uAeyd5f72Z9ZlZXz6fn26tqTcwNMrCjja6suGUi9ZzEZHnVfuk6Drgbncve/mFu29x9x537+nu7q7yrpMvPzhC97w25s1pobnJNEIXkdNUEugHgYuKni8OXitnHZpuqYmw7X9hRxtNTUZne0YnRUXkNJUE+g5ghZktM7MMhdDuLd3IzH4H6AR+XN0SBZ5vKlrYUZg/78pmODSkQBeR500Z6O4+DmwAtgO7gTvdfZeZ3Wpma4o2XQdsdXevTanpdirQg0sWc9kMR4YV6CLyvCkvWwRw923AtpLXNpU8/0j1ypJS4Tou3R3PB/ru3z4XZUkiEjPqFG0QYdt/d9EIXXPoIlJMgd4gBopWWgx/PTo8xvjEySjLEpEYUaA3iIGhETrbW2ltLvyRdQUnR48e17XoIlKgQG8Q4TXooc72sFtU0y4iUqBAbxBh23/o1HouunRRRAIK9AYRtv2Hch0aoYvI6RToDaJ0yuXUAl26Fl1EAgr0BlDc9h86NYeuKRcRCSjQG0Bp2z9Aa3MT58xp0QJdInKKAr0BlLb9h7o62nSTCxE5RYHeAErb/kNaz0VEiinQG0Bp23+os10rLorI8xToDaC07T/UpfVcRKSIAr0BlLb9h3IdhSkXrVgsIqBAbwil16CHurIZxiacwZHxCKoSkbhRoDeA0rb/kK5FF5FiCvQGUNr2Hwrb/3XpoohAhYFuZqvNbI+Z9ZvZxkm2+SMze8zMdpnZV6tbZrpNNkIPF+jSiVERgQpuQWdmzcBm4PXAAWCHmfW6+2NF26wAPgS8yt2PmNm5tSo4bY6NjDM8OlF2Dj2ccjmiQBcRKhuhrwL63X2vu48CW4G1Jdv8B2Czux8BcPdnqltmepVr+w91acpFRIpUEuiLgP1Fzw8ErxV7AfACM/uRmT1gZqvLfZCZrTezPjPry+fzM6s4ZSZr+wdoz7Qwp7VJ67mICFC9k6ItwArgWuBG4HNmtqB0
I3ff4u497t7T3d1dpV0nW34w6BItM4cO0JXVei4iUlBJoB8ELip6vjh4rdgBoNfdx9z9SeCXFAJeZikfjNDLzaEDdGZbNYcuIkBlgb4DWGFmy8wsA6wDeku2+TqF0TlmtpDCFMze6pWZXpO1/Ydy2TZd5SIiQAWB7u7jwAZgO7AbuNPdd5nZrWa2JthsO3DIzB4D7gP+i7sfqlXRaTJZ23+oK5vRlIuIABVctgjg7tuAbSWvbSp67MAHgx+posmuQQ/ltECXiATUKRpzk63jEsplMwyPTnBibKKOVYlIHCnQY26ytv9QTt2iIhJQoMdcJVMuoEAXEQV6rIVt/wvnlb/CBZ5fz0UnRkVEgR5jYZfoZE1FAJ1ZreciIgUK9Bg7W9t/SCN0EQkp0GNsqrZ/gHPmtNLcZFrPRUQU6HH2/EqLkwd6U5PR2a5r0UVEgR5r+aDtv6vM0rnFctlWBbqIKNDjbKq2/5C6RUUEFOixNtU16CEtoSsioECPtam6REMaoYsIKNBjbap1XEKd2QzPHh9jfOJkHaoSkbhSoMdY5VMuGdzh6PGxOlQlInGlQI+p4dGp2/5DWs9FRECBHlsDQVNRpSN0gENDCnSRNFOgx1R+6AQw+b1Ei51az2VYgS6SZhUFupmtNrM9ZtZvZhvLvH+zmeXNbGfw8++rX2q6VNL2H9J6LiICFdyCzsyagc3A64EDwA4z63X3x0o2vcPdN9SgxlSqpO0/FI7QD2vKRSTVKhmhrwL63X2vu48CW4G1tS1LwkCfqu0foLW5iXPmtGiBLpGUqyTQFwH7i54fCF4r9WYze9TM7jazi8p9kJmtN7M+M+vL5/MzKDc98oOVtf2HctkMh4d12aJImlXrpOj/BZa6+0uAbwNfKreRu29x9x537+nu7q7SrpOp0mvQQ4VuUY3QRdKskkA/CBSPuBcHr53i7ofcPUyTzwNXVae89Kq07T+Uy7bpskWRlKsk0HcAK8xsmZllgHVAb/EGZnZB0dM1wO7qlZhOA0MjZ71TUakureciknpTXuXi7uNmtgHYDjQDt7n7LjO7Fehz917gz8xsDTAOHAZurmHNqZAfHKnoksVQZzbDkeFR3B0zq2FlIhJXUwY6gLtvA7aVvLap6PGHgA9Vt7T0mk7bf6grm2FswhkcGeecOa01rE5E4kqdojE0nbb/UE7XooukngI9hvLBNeiVtP2Hch3qFhVJOwV6DIX3Ep3OHHquPVjPRYEukloK9BiaTtt/SEvoiogCPYam0/Yf6tKUi0jqKdBjaGBoem3/AHNbm2lraVK3qEiKKdBjKD84vbZ/ADMLmou0notIWinQY2i6bf+hXIfWcxFJMwV6DE237T+Uy7bppKhIiinQY2hgmm3/oVx7q06KiqSYAj1mhkfHOTbNtv9QLtum69BFUkyBHjMzafsPdXVkODY6wYmxiWqXJSINQIEeM6fa/mcy5aLmIpFUU6DHzMAM1nEJdbYr0EXSTIEeM+E6LjOdcgEFukhaKdBjZiZt/yFNuYikmwI9ZgaGRlgwzbb/UFdW67mIpFlFqWFmq81sj5n1m9nGs2z3ZjNzM+upXonpMjA4OqMTogDnzGmlucl06aJISk0Z6GbWDGwGbgBWAjea2coy280D3g88WO0i0yQ/NP11XEJNTUanmotEUquSEfoqoN/d97r7KLAVWFtmu/8OfAw4UcX6Umembf+hXFbruYikVSWBvgjYX/T8QPDaKWZ2JXCRu3/jbB9kZuvNrM/M+vL5/LSLTYOBwREWzuCEaKgQ6Bqhi6TRrE+KmlkT8EngP0+1rbtvcfced+/p7u6e7a4TJ2z7n8k16CEFukh6VRLoB4GLip4vDl4LzQNeBNxvZvuAlwO9OjE6fbNp+w8p0EXSq5JA3wGsMLNlZpYB1gG94Zvu/qy7L3T3pe6+FHgAWOPufTWpOMFm0/YfymXbOHp8jImTXq2yRKRBTBno7j4ObAC2A7uBO919l5ndamZral1gmszk
5tClurIZ3OHIsEbpImnTUslG7r4N2Fby2qZJtr129mWlU9j2P5s59M6guejIsZnd9UhEGpc6RWNkNm3/IXWLiqSXAj1GZtP2H9J6LiLppUCPkYHB2U+TaIQukl4K9BjJD83sXqLFFrQ/P4cuIumiQI+R2bb9A2Rampg3p0VTLiIppECPkdm2/Ye6shlNuYikkAI9JsK2/2pcaqgFukTSSYEeE2Hb/2yuQQ8VAn1s1p8jIo1FgR4T1Wj7D2mELpJOCvSYqEbbfyiXbePwsVHctZ6LSJoo0GPiVKDPq85J0bEJZ3BkfNafJSKNQ4EeE+E6Ll3Z2Y/Qi9dzEZH0UKDHRNj2n2mZ/R+JukVF0kmBHhPVaPsPnVrPZUiBLpImCvSYGBiqTlMRaIEukbRSoMdEfmiE7nlzqvJZpwJdN7kQSRUFekxUq+0foD3TTFtLk0boIilTUaCb2Woz22Nm/Wa2scz77zGzn5nZTjP7oZmtrH6pyXV8dKJqbf8AZlZYz0Vz6CKpMmWgm1kzsBm4AVgJ3FgmsL/q7i9295cCHwc+We1Ckyy8Br0abf+hTnWLiqROJSP0VUC/u+9191FgK7C2eAN3f67oaRZQi+I0PDNYvbb/UC6b4fCw1nMRSZNKbhK9CNhf9PwAcHXpRmZ2C/BBIAO8ttwHmdl6YD3AkiVLpltrYlWz7T/Ulc2w79Cxqn2eiMRf1U6Kuvtmd78U+K/Af5tkmy3u3uPuPd3d3dXadcOrZtt/KJdt03XoIilTSaAfBC4qer44eG0yW4E3zaKm1AmXzq1G238ol23l2OgEJ8YmqvaZIhJvlQT6DmCFmS0zswywDugt3sDMVhQ9/X3g8eqVmHz5oRNVa/sP5YJ/HI7oWnSR1JhyDt3dx81sA7AdaAZuc/ddZnYr0OfuvcAGM7seGAOOADfVsuikqWbbfyhsLjo0NMoF8+dW9bNFJJ4qOSmKu28DtpW8tqno8furXFeqVLPtP9TVofZ/kbRRp2gMDFSx7T/U2a5AF0kbBXoM5KvY9h/q0gJdIqmjQI9Ytdv+Q/PnttLcZAp0kRRRoEdsoIo3hy7W1GR0trfqJhciKaJAj1i+Buu4hDrbtZ6LSJoo0CMW3ku02lMuULh08cgxrecikhYK9IjVou0/1NWR4ZBG6CKpoUCPWC3a/kO5bEYnRUVSRIEesYGhkaq3/Ydy7RmOHh9j4qRWMxZJAwV6xArXoFd/dA6FEbo7HNV6LiKpoECPWC3a/kO54B8KTbuIpIMCPWKFQK/NCD3sFtW16CLpoECP2MDQaE2uQQet5yKSNgr0CB0fnWBoZLx2I3StuCiSKgr0CNWq7T+kEbpIuijQI5SvYVMRQKaliXlzWhToIimhQI/QwGA4Qq/uWujFctmMToqKpERFgW5mq81sj5n1m9nGMu9/0MweM7NHzey7ZnZx9UtNnlqP0CFcz0WBLpIGUwa6mTUDm4EbgJXAjWa2smSznwI97v4S4G7g49UuNIlq2fYf6tIIXSQ1KhmhrwL63X2vu48CW4G1xRu4+33uPhw8fQBYXN0yk2lgaIT5c2vT9h8qrOeiBbpE0qCSJFkE7C96fiB4bTLvBr5Z7g0zW29mfWbWl8/nK68yoQr3Eq3d6BygM1igy13ruYgkXVWHhmb2J0AP8Ily77v7Fnfvcfee7u7uau66IdXiXqKlurIZxiacoZHxmu5HRKJXSaAfBC4qer44eO00ZnY98GFgjbvr//gVqGXbfyiX1XouImlRSaDvAFaY2TIzywDrgN7iDczsCuAfKYT5M9UvM5kGhkZrHuhaz0UkPaYMdHcfBzYA24HdwJ3uvsvMbjWzNcFmnwA6gLvMbKeZ9U7ycRII2/7rMYcOcHhIgS6SdC2VbOTu24BtJa9tKnp8fZXrSrxat/2HwhH6Ya2JLpJ46hSNSD2aiqBw2SJoDl0kDRToEQnb/ms9h96eaaatpUmBLpICCvSIDARz2rWeQzezwnoumkMXSTwFekTy
wQi9lm3/oVw2wxHNoYskngI9IvVo+w9pxUWRdFCgR6SWN4cupfVcRNJBgR6ReqzjEsplM7oOXSQFFOgRKazjUp9A78pmODY6wYmxibrsT0SioUCPSD3a/kPhei46MSqSbA0Z6GGXZaM6MVaftv9QLtsKoEsXRRKu4QL9M/c/wXWfuJ9nh8eiLmXG8oP1afsPacVFkXRouEC/9rJuBkfG+dKP90VdyozVq+0/FLb/a8pFJNkaLtBfeME5XP/Cc7ntR09yrEFv2lCvtv/QqSV0NeUikmgNF+gAt1y3nKPDY3zlwV9FXcqM1KvtPzR/bitNpikXkaRryEC/Ykknr1rexed+8GRDXopXz7Z/gKYmo7Nd3aIiSdeQgQ6FUXp+cIS7+vZPvXHM1LPtP5TLZjiiQBdJtIYN9Fdc0sWVSxbw2e/tZWziZNTlTEs92/5DhfZ/BbpIklUU6Ga22sz2mFm/mW0s8/41ZvYTMxs3s7dUv8yyNbHhtcs5ePQ4X//pGfesjrV6tv2HCgt0Nfb1+yJydlMGupk1A5uBG4CVwI1mtrJks18DNwNfrXaBZ3PdZeey8oJz+Mz9TzBx0uu561mpZ9t/SCN0keSrZIS+Cuh3973uPgpsBdYWb+Du+9z9UaCucx9mxi3XLWfvwDG++fOn6rnrWaln23+oK5vh6PGxhvqHT0Smp5JAXwQUn3k8ELw2bWa23sz6zKwvn8/P5CPOsPpF53Npd5bN9z2Be/zDqt5t/6FcNoM7HFVzkUhi1fWkqLtvcfced+/p7u6uymc2NxnvvXY5u596jnt/8UxVPrOW6t32H+rUzaJFEq+SQD8IXFT0fHHwWmyseemFLO6cy9/f2x/7UXq92/5D4TXvuhZdJLkqCfQdwAozW2ZmGWAd0FvbsqantbmJ97zmUnbuP8q/PnEo6nLOqt5t/6FT67ko0EUSa8pAd/dxYAOwHdgN3Onuu8zsVjNbA2BmLzOzA8BbgX80s121LLqct1y1mHPntfHpe/vrveuKuTvf2f00AOefM6eu++4KrnvXCF0kuVoq2cjdtwHbSl7bVPR4B4WpmMjMaW1m/TWX8NFv7ObhXx3hqos7oyynrL/97uPc2XeA//iaSzi3zoG+oL2wJrrm0EWSq2E7Rct529VL6GxvZfN98Rul3/7jfXzqO4/z1qsWs3H179R9/20tzcxra1GgiyRYogK9PdPCu161jHt/8Qy7fvNs1OWc8i+P/oZNvbu4/oXn8T/+8MWYWSR15DrUXCSSZIkKdIB3vHIp89pa+If7noi6FAB++PgAH7hjJz0Xd/Lpt11BS3N0h1zdoiLJlrhAnz+3lT99xcVs+/lT9D8zFGktj+w/yvrb+7i0u4PP3/Qy5rQ2R1pPTkvoiiRa4gId4N2vXkZbSxOfuT+6UfoT+SHe+cUd5LIZvvyuVcyf2xpZLaHCCF0LdIkkVSIDvaujjbetupiv7zzI/sPDdd//b589wTu+8BBNBre/++q6X9EymVxHhiPHxmLffCUiM5PIQAdYf80lNJvx2e/Vd5R+dHiUd9z2IM8eH+OL71zFsoXZuu7/bLqyGUYnTjLUoPdiFZGzS2ygnz9/Dm++ajF39R3g6edO1GWfw6PjvOuLO9g3MMyWd1zFixbNr8t+K9XZrvVcRJIssYEO8J9ecykT7nzu+3trvq+xiZO89ys/Yef+o/zdjS/llZcurPk+p0vdoiLJluhAX9LVzprLL+QrD/66pqPSkyedv7z7Ue7fk+ejb3oxq190Qc32NRu5YIEurecikkyJDnSA9157KcfHJvinHz1Zk893dz76jd3c89OD/MUbXsDbrl5Sk/1UQ1dWI3SRJEt8oK84bx6rf/d8vviv+3juxFjVP/8z33uC2370JDe/cim3XLe86p9fTVoTXSTZEh/oALdct5zBE+Pc/uNfVfVztz70az7+//aw5vIL2fQHKyNr6a9UNtNMpqVJgS6SUKkI9Bcvns9rXtDNbT98kuOjE1X5
zO27fstf3fMzrnlBN3/z1stpaop3mEPhHqxdav8XSaxUBDrA+167nEPHRvnaQ7+e9Wc9sPcQ7/vaT3nJ4gV89k+uJNPSOIdR67mIJFdF66EnQc/SHFcvy7Hl+3t5+8uX0NZS2boqzx4f4+CR4xw8epyDR4Y5ePQ4Wx/az5JcO/9088tozzTWIcxltZ6LSFI1VhrN0obXLudPv/AQ//zwQd529RLcnfzQSFFgn/nrYElXZaaliZUXnMM/vP3KUycZG0kum2HfoWNRlyEiNVBRoJvZauBvgWbg8+7+1yXvtwFfBq4CDgF/7O77qlvq7L16+UIuXzyfv/7mbj73g70cPHqc0fGTp20zb04LixbMZXHnXK5elmNR51wWLWgPfp3Lwo5M7E9+nk0uW1jPRUSSZ8pAN7NmYDPweuAAsMPMet39saLN3g0ccfflZrYO+Bjwx7UoeDbMjL964wv5m2/t4dx5c3j9yvNYtKAQ1Is6Cz/nzIl+VcRayrVnGBoZZ2R8ouJpJxFpDJWM0FcB/e6+F8DMtgJrgeJAXwt8JHh8N/BpMzOP4bJ+V1/SxV3veWXUZUQmF7T/3/CpH9DcAFfmiCTRn71uBf/28gur/rmVBPoiYH/R8wPA1ZNt4+7jZvYs0AUMFG9kZuuB9QBLlsS3ozLJrr3sXN700gsZnTg59cYiUhO1uj9CXU+KuvsWYAtAT09P7EbvabBowVw+te6KqMsQkRqo5ALqg8BFRc8XB6+V3cbMWoD5FE6OiohInVQS6DuAFWa2zMwywDqgt2SbXuCm4PFbgHvjOH8uIpJkU065BHPiG4DtFC5bvM3dd5nZrUCfu/cCXwBuN7N+4DCF0BcRkTqqaA7d3bcB20pe21T0+ATw1uqWJiIi09E4i5CIiMhZKdBFRBJCgS4ikhAKdBGRhLCori40szww01sILaSkCzVmVN/sqL7Zi3uNqm/mLnb37nJvRBbos2Fmfe7eE3Udk1F9s6P6Zi/uNaq+2tCUi4hIQijQRUQSolEDfUvUBUxB9c2O6pu9uNeo+mqgIefQRUTkTI06QhcRkRIKdBGRhIh1oJvZajPbY2b9ZraxzPttZnZH8P6DZra0jrVdZGb3mdljZrbLzN5fZptrzexZM9sZ/Gwq91k1rHGfmf0s2HdfmffNzP4uOH6PmtmVdaztsqLjstPMnjOzPy/Zpu7Hz8xuM7NnzOznRa/lzOzbZvZ48GvnJL/3pmCbx83spnLb1KC2T5jZL4I/v3vMbMEkv/es34Ua1/gRMztY9Of4xkl+71n/vtewvjuKattnZjsn+b11OYaz4u6x/KGwVO8TwCVABngEWFmyzXuBzwaP1wF31LG+C4Arg8fzgF+Wqe9a4F8iPIb7gIVnef+NwDcBA14OPBjhn/VvKTRMRHr8gGuAK4GfF732cWBj8Hgj8LEyvy8H7A1+7Qwed9ahtjcALcHjj5WrrZLvQo1r/AjwFxV8B876971W9ZW8/z+BTVEew9n8xHmEfurm1O4+CoQ3py62FvhS8Phu4HVmVpc7H7v7U+7+k+DxILCbwr1VG8la4Mte8ACwwMwuiKCO1wFPuPtMO4erxt2/T2FN/2LF37MvAW8q81t/D/i2ux929yPAt4HVta7N3b/l7uPB0wco3FEsMpMcv0pU8vd91s5WX5AdfwR8rdr7rZc4B3q5m1OXBuZpN6cGwptT11Uw1XMF8GCZt19hZo+Y2TfN7HfrWxkOfMvMHg5u0F2qkmNcD+uY/C9RlMcvdJ67PxU8/i1wXplt4nAs30Xhf1zlTPVdqLUNwbTQbZNMWcXh+P0b4Gl3f3yS96M+hlOKc6A3BDPrAP4Z+HN3f67k7Z9QmEa4HPh74Ot1Lu/V7n4lcANwi5ldU+f9T8kKtzVcA9xV5u2oj98ZvPB/79hd62tmHwbGga9MskmU34XPAJcCLwWeojCtEUc3cvbReez/PsU50GN/c2oza6UQ5l9x9/9d+r67P+fuQ8HjbUCrmS2s
V33ufjD49RngHgr/rS1WyTGutRuAn7j706VvRH38ijwdTkUFvz5TZpvIjqWZ3Qz8AfD24B+cM1TwXagZd3/a3Sfc/STwuUn2Hel3MciPPwTumGybKI9hpeIc6LG+OXUw3/YFYLe7f3KSbc4P5/TNbBWF412Xf3DMLGtm88LHFE6e/bxks17gHcHVLi8Hni2aWqiXSUdFUR6/EsXfs5uA/1Nmm+3AG8ysM5hSeEPwWk2Z2WrgL4E17j48yTaVfBdqWWPxeZl/N8m+K/n7XkvXA79w9wPl3oz6GFYs6rOyZ/uhcBXGLymc/f5w8NqtFL68AHMo/Fe9H3gIuKSOtb2awn+9HwV2Bj9vBN4DvCfYZgOwi8IZ+weAV9axvkuC/T4S1BAev+L6DNgcHN+fAT11/vPNUgjo+UWvRXr8KPzj8hQwRmEe990Uzst8F3gc+A6QC7btAT5f9HvfFXwX+4F31qm2fgpzz+F3MLzq60Jg29m+C3U8frcH369HKYT0BaU1Bs/P+Ptej/qC178Yfu+Kto3kGM7mR63/IiIJEecpFxERmQYFuohIQijQRUQSQoEuIpIQCnQRkYRQoIuIJIQCXUQkIf4/kSDUFuhkkzUAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "eventab, evenclusters = fp.make_power(\n",
    "    replicates = r, \n",
    "    nloci = nloci,\n",
    "    allele_freqs = allele_freqs,\n",
    "    candidates = nadults,\n",
    "    sires = sires,\n",
    "    offspring = offspring, \n",
    "    missing_loci=0,\n",
    "    mu_real = mu, \n",
    "    unsampled_input=0.01,\n",
    "    return_clusters=True,\n",
    "    verbose=False\n",
    ")\n",
    "even_famsizes = np.array([evenclusters[i].family_size() for i in range(len(evenclusters))])\n",
    "\n",
    "plt.plot(even_famsizes.mean(0))\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Custom simulations"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Once you are familiar with the basic building blocks for generating data and running analysis, creating your own simulations is largely a case of setting up combinations of parameters and looping over them.\n",
    "Given the vast array of possible scenarios you could want to simulate, it is impossible to be comprehensive here, so it must suffice to give a couple of examples for inspiration."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Likelihood for missing sires"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this example I was interested in the performance of the likelihood estimator for a sire being absent.\n",
    "This is the likelihood of generating the offspring genotype if paternal alleles come from population allele frequencies.\n",
    "This is what the attribute `lik_absent` in a `paternityArray` tells you.\n",
    "\n",
    "Ideally this likelihood should be below the likelihood of paternity for the true sire, but higher than that of the other candidates. I suspected this would not be the case when minor allele frequency is low and there are many candidates.\n",
    "\n",
    "This cell sets up the simulation. I'm considering 50 loci and a genotyping error rate of mu=0.0015, while varying the number of candidate adults and the allele frequency."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Common simulation parameters\n",
    "nreps        = 10 # number of replicates\n",
    "nloci        = [50] # number of loci\n",
    "allele_freqs = [0.1, 0.2, 0.3, 0.4, 0.5] # allele frequencies to test (one value shared by all loci)\n",
    "nadults      = [10, 100, 250, 500, 750, 1000] # size of the adult population\n",
    "mu_list      = [0.0015] # genotype error rates\n",
    "nsims        = nreps * len(nloci) * len(allele_freqs) * len(nadults) * len(mu_list) # total number of simulations to run\n",
    "dt           = np.zeros([nsims, 7]) # empty array to store data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This cell simulates genotype data and clusters the offspring into full sibships.\n",
    "The code pulls out the mean probability that each sire is absent, and the rank of the likelihood for a missing sire among the likelihoods of paternity for the candidates."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Beginning simulations on Wed Aug 18 11:12:20 2021.\n",
      "Completed in 0.03 hours.\n"
     ]
    }
   ],
   "source": [
    "t0 = time()\n",
    "counter = 0\n",
    "\n",
    "print(\"Beginning simulations on {}.\".format(asctime(localtime(time()) )))\n",
    "\n",
    "for r in range(nreps):\n",
    "    for l in range(len(nloci)):\n",
    "        for a in range(len(allele_freqs)):\n",
    "            for n in range(len(nadults)):\n",
    "                for m in range(len(mu_list)):\n",
    "                    af = np.repeat(allele_freqs[a], nloci[l])\n",
    "                    adults  = fp.make_parents(nadults[n], af)\n",
    "                    progeny = fp.make_offspring(adults, 100)\n",
    "                    mi      = progeny.parent_index('m', adults.names) # maternal index\n",
    "                    mothers = adults.subset(mi)\n",
    "                    patlik  = fp.paternity_array(progeny, mothers, adults, mu_list[m], missing_parents=0.01)\n",
    "                    # Compute the probability array once; the last column is the missing-sire term.\n",
    "                    probs   = patlik.prob_array()\n",
    "                    # Find the rank of the missing term within the array.\n",
    "                    rank    = [np.where(np.sort(probs[i]) == probs[i,-1])[0][0] for i in range(progeny.size)]\n",
    "                    rank    = np.array(rank).mean() / nadults[n]\n",
    "                    # Get the mean posterior probability for the missing term.\n",
    "                    prob_missing = np.exp(probs[:, -1]).mean()\n",
    "                    # Export data\n",
    "                    dt[counter] = np.array([r, nloci[l], allele_freqs[a], nadults[n], mu_list[m], rank, prob_missing])\n",
    "                    # Update counter\n",
    "                    counter += 1\n",
    "\n",
    "print(\"Completed in {} hours.\".format(round((time() - t0)/3600,2)))\n",
    "\n",
    "head = ['rep', 'nloci', 'allele_freqs', 'nadults', 'mu', 'rank', 'prob_missing']\n",
    "dt = pd.DataFrame(dt, columns=head)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}