random.sample(population, k) Return a k length list of unique elements chosen from the population sequence. Used for random sampling without replacement. Basically, it picks k unique random elements, a sample, from a sequence:
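A minimal sketch of the call described above, using only the standard library. The population of integers and the seed value are assumptions chosen for illustration; they are not from the original answer.

```python
import random

# Hypothetical population: the integers 1..100.
population = list(range(1, 101))

# A seeded generator makes the run reproducible; the seed itself is arbitrary.
rng = random.Random(42)

# Draw 5 elements without replacement: no position in the population is picked twice.
sample = rng.sample(population, 5)
print(sample)
```

Because the sampling is without replacement, asking for more elements than the population holds raises ValueError.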
This question asks about getting a random(ish) sample of records on SQL Server, and the answer was to use TABLESAMPLE. Is there an equivalent in Oracle 10? If there isn't, is there a standard way to get a random sample of results from a query set? For example, how can one get 1,000 random rows from a query that would normally return millions?
import numpy as np
chosen_idx = np.random.choice(1000, replace=False, size=50)
df_trimmed = df.iloc[chosen_idx]
This is of course not considering your block structure. If you want a 50 item sample from block i for example, you can do:
If you can use pseudo-random sampling and you're on SQL Server 2005/2008, then take a look at TABLESAMPLE. For instance, here is an example from SQL Server 2008 / AdventureWorks 2008 that works based on rows:
USE AdventureWorks2008;
GO
SELECT FirstName, LastName
FROM Person.Person TABLESAMPLE (100 ROWS)
WHERE EmailPromotion = 2;
In spherical coordinates, taking advantage of the sampling rule:
phi = random(0,2pi)
costheta = random(-1,1)
u = random(0,1)
theta = arccos( costheta )
r = R * cuberoot( u )
Now you have a (r, theta, phi) group which can be transformed to (x, y, z) in the usual way.
x = r * sin( theta) * cos( phi )
y = r * sin( theta) * sin( phi )
z = r * cos ...
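The recipe above can be sketched in Python roughly as follows. The function name sample_in_ball, the radius, and the seed are assumptions for illustration; the z component uses the standard spherical-to-Cartesian conversion, since the original snippet is cut off at that point.

```python
import math
import random


def sample_in_ball(radius: float, rng: random.Random) -> tuple[float, float, float]:
    """Return one point drawn uniformly from the ball of the given radius."""
    phi = rng.uniform(0.0, 2.0 * math.pi)   # azimuth, uniform on [0, 2*pi]
    cos_theta = rng.uniform(-1.0, 1.0)      # uniform in cos(theta), not in theta
    u = rng.random()                        # uniform on [0, 1)

    theta = math.acos(cos_theta)
    r = radius * u ** (1.0 / 3.0)           # cube root compensates for volume growing as r^3

    # Standard spherical-to-Cartesian conversion.
    x = r * math.sin(theta) * math.cos(phi)
    y = r * math.sin(theta) * math.sin(phi)
    z = r * math.cos(theta)
    return x, y, z


rng = random.Random(0)  # arbitrary seed, for reproducibility only
points = [sample_in_ball(2.0, rng) for _ in range(1000)]
```

Drawing cos(theta) rather than theta uniformly avoids clustering points near the poles, and the cube root keeps the density uniform in volume instead of concentrating points near the center.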
With a nice random number generator that guaranteed no duplicates when generating m numbers in a row, an O(m) solution would be possible. Given the three assumptions, the basic idea is to generate m unique random numbers between 1 and n, and then select the rows with those keys from the table.
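One way to sketch this idea in Python is shown below. It assumes the keys are the contiguous integers 1 to n, which may not match the original answer's stated assumptions; the function unique_keys and the dictionary standing in for the table are hypothetical. Since an ordinary generator can repeat values, duplicates are discarded with a set and redrawn, which stays close to O(m) expected work when m is much smaller than n.

```python
import random


def unique_keys(m: int, n: int, rng: random.Random) -> set[int]:
    """Return m distinct random integers in the range 1..n inclusive."""
    if m > n:
        raise ValueError("cannot draw more unique keys than exist")
    chosen: set[int] = set()
    while len(chosen) < m:
        # randint is inclusive on both ends; repeats are absorbed by the set.
        chosen.add(rng.randint(1, n))
    return chosen


# Hypothetical stand-in for a table keyed by 1..n.
table = {key: f"row-{key}" for key in range(1, 1001)}

rng = random.Random(0)  # arbitrary seed, for reproducibility only
keys = unique_keys(10, len(table), rng)
rows = [table[key] for key in sorted(keys)]
```

When m approaches n the redraws become frequent; in that case random.sample(range(1, n + 1), m) is a simpler choice.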
The second parameter passed to sample, 150, is how many random samplings you want. The square bracket slicing specifies the rows of the indices returned. Variable 'a' gets the value of the random sampling.
I just discovered that the RAND() function, while undocumented, works in BigQuery. I was able to generate a (seemingly) random sample of 10 words from the Shakespeare dataset using:
SELECT word
FROM (SELECT rand() as random, word FROM [publicdata:samples.shakespeare] ORDER BY random)
LIMIT 10
See the function strata from the package sampling. The function performs stratified simple random sampling and returns a sample as a result. Two extra columns are added: inclusion probabilities (Prob) and a strata indicator (Stratum). See the example.
I wonder if you mean that R's internal random number generator isn't up to your standards, and so using it to 'randomly' select a subset of your fancily generated pseudo-random numbers defeats their purpose. So maybe you mean you want to use your pre-generated random #'s to generate a subset of itself? Or am I being too cute about this? ;)