In Excel I Would Like To Take A Column Of Values And Then Make A Thousand Randomized Copies Of

How do I restrict the value of a randomly generated number in Excel?

For example, to restrict the random number to the range 10 to 17, use the RANDBETWEEN(bottom, top) function: =RANDBETWEEN(10,17). Both bounds are inclusive and the result is always an integer. The value updates each time the sheet recalculates.
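For readers who script rather than work in the sheet, the same bounded draw can be sketched in Python. This is only an illustration and not part of the original answer: `rand_between` is a hypothetical helper name, and it relies on the standard library's `random.randint`, which includes both endpoints just as RANDBETWEEN does.

```python
import random

def rand_between(bottom, top):
    """Return a random integer N with bottom <= N <= top (both inclusive)."""
    return random.randint(bottom, top)

# Draw a batch of values; every one falls in the closed range 10..17.
draws = [rand_between(10, 17) for _ in range(1000)]
print(min(draws), max(draws))
```

As in Excel, each call produces a fresh value, so store the result if it needs to stay fixed.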

How do I create a random examination seating arrangement in excel?

There may be a better way than this, but this is a quick and easy way to do it:
Starting in cell A1, enter the students’ names in column A.
In cell B1, enter this formula and copy it down for as many rows as there are students:
=RAND()
In cell C1, enter this formula and copy it down for as many rows as there are students:
=RANK(B1,B:B)
In cell D1, enter this formula and copy it down for as many rows as there are students:
=INDEX(A:A,MATCH(ROW(),C:C,0))
In column E, enter the seat identifiers, such as seat number, or seat row/column, or however you identify the seats.
Press F9 to regenerate the random assignments. Each time you press F9, you’ll get a new set of random seat assignments, with student names in column D and your seat identifiers in column E. The same will happen whenever the sheet calculates for any reason.
To use a set, copy it somewhere and paste it as values so it doesn’t change the next time the sheet calculates. Or just print it.
You could take it a step further like this:
Assign a defined name to each used cell in column D. The names could just be “Seat1”, “Seat2”, etc. Or if you identify the seats by row/column, the names could be “Seat_R1C2”, or however you identify the seats. (To assign a defined name to a cell, select the cell, then go to the “Name Box” to the left of the formula bar, enter the name, and press Enter. No spaces are allowed in defined names.)
On another sheet in the workbook, lay out a grid of cells to represent your seats. In each of them, enter a reference to the matching defined name. For example:
=Seat_R1C2
Now you have a seating chart you can print out, with student names shown at seat positions, which gives you a new set of random seating assignments each time you press F9. But again, you might want a second area laid out the same way, for pasting values that won’t change the next time the sheet calculates. Or again, just print it.
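The RAND/RANK/INDEX steps above can also be sketched outside Excel. The Python below is a rough illustration under stated assumptions, not the answer's own method: `assign_seats` is a hypothetical helper, the student names are made up, and the seat labels follow the “Seat1”, “Seat2” naming suggested in the answer.

```python
import random

def assign_seats(students, seats, rng=random):
    """Map each seat to a student in random order, mirroring RAND/RANK/INDEX."""
    if len(seats) < len(students):
        raise ValueError("need at least one seat per student")
    # Column B: one random key per student.
    keys = [rng.random() for _ in students]
    # Column C: RANK is descending by default, so the largest key gets rank 1.
    by_rank = sorted(range(len(students)), key=lambda i: keys[i], reverse=True)
    # Columns D and E: the student with rank k sits in the k-th seat.
    return {seat: students[i] for seat, i in zip(seats, by_rank)}

students = ["Ana", "Ben", "Chloe", "Dev"]     # hypothetical names
seats = ["Seat1", "Seat2", "Seat3", "Seat4"]  # labels as suggested in the answer
chart = assign_seats(students, seats)
for seat, student in chart.items():
    print(seat, student)
```

Passing a seeded generator such as `random.Random(7)` as `rng` plays the role of pasting values: the same seed reproduces the same chart instead of reshuffling on every run.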

Why do people say R is better than Excel? Even a simple VLOOKUP or SUMPRODUCT formula takes like ages to write in R.

It is not that difficult in R. Let's take a data frame:
df <- data.frame(a=c(1,2,3,4),b=c(2,3,4,5))
A SUMPRODUCT for this would be:
with(df,sum(a*b))
Suppose you want to run other functions over this at the same time, such as the mean of the sum of the two columns. Just include them in a list, and the output will be a list of the calculations:
with(df,list(sum(a*b), mean(a+b)))
For the VLOOKUP you could use %in%, or dplyr, or a combination. For example, consider this data frame:
df <- data.frame(color = c('blue', 'blue', 'red', 'green'), a=c(1,2,3,4),b=c(2,3,4,5))
If I want to look up only the values of column a where the color is blue, dplyr can do this in one line (after loading it with library(dplyr)):
df %>% filter(color=='blue') %>% select(a)
What if I want to look up multiple values?
vlkup_values <- c('blue','red')
df %>% filter(color %in% vlkup_values) %>% select(a)
Well, that was not too hard.
I really don’t want to start flame wars between R and Excel, or show off my R knowledge. Each tool has its own domain where it shines. Excel has the DAX formula language and a whole host of business intelligence tools such as Power BI, Power Pivot and Power Query, which make some tasks much easier in Excel than in R; these are free but not open source. You can run queries on a billion-row file without Excel crashing.
R has several domain-specific and data science packages, such as dplyr and modelr, and ggplot2 for visualization. These packages make modelling an easy exercise.
Where R really shines, in my view, is in reproducible research. Science schools teach us to think, analyse and record, but they never teach us to reproduce the work we do. If you are doing an important piece of research that you will need to revisit a year from now, I have found R a better option than Excel. With Excel, if you change the details in one cell, the file will look exactly the same as before, so the change is easy to miss. Excel files cannot be version controlled reliably. R scripts, on the other hand, are plain text and work with version control tools like git, so changes cannot go unnoticed.
In short, I want to champion reproducibility of code and its results, and version control, as basic requirements to weigh before one chooses a language to program in.

How can one determine if a data set is normally distributed?

Given a set of observations, how do you test whether a random variable follows a normal distribution?
This is a very frequently asked question in data mining and machine learning. It is answered through hypothesis testing.
There is a lot of good material available on tests of normality, but I will try to explain the basics in my answer.
I have used the chi-squared, Kolmogorov-Smirnov and Shapiro-Wilk tests in the past. All of these are available in R, and in Python they are implemented in SciPy (scipy.stats) rather than in NumPy itself.
The null hypothesis is usually that the values are sampled from a normal distribution. So generally a two-tailed hypothesis test is performed, and if the null hypothesis is rejected, you say with a certain confidence that the values do not come from a normal distribution. The converse is generally assumed but is, strictly speaking, not true: in a significance test (in Fisher’s sense) you can only reject hypotheses, so failing to reject does not prove normality.
The test statistic is generally based on some known property of the normal distribution. Chi-squared tests use the fact that the square of a standard normal variable follows a chi-squared distribution. KS tests use the cumulative distribution function to compare any two distributions, in this case the empirical distribution of your sample and a normal distribution.
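To make the KS idea concrete, here is a minimal standard-library Python sketch that measures the largest gap between a sample's empirical CDF and the CDF of a fully specified normal distribution. The function names `ks_statistic` and `ks_p_value` are hypothetical, and the p-value uses the asymptotic Kolmogorov series with a common small-sample correction, so treat it as an approximation rather than a reference implementation.

```python
import math
import random
from statistics import NormalDist

def ks_statistic(sample, dist):
    """Largest gap between the empirical CDF of `sample` and the CDF of `dist`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        cdf = dist.cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d

def ks_p_value(d, n, terms=100):
    """Approximate p-value from the asymptotic Kolmogorov distribution."""
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.2:
        return 1.0  # the series converges poorly here; p is essentially 1
    total = sum((-1) ** (k - 1) * math.exp(-2 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return max(0.0, min(1.0, 2 * total))

# Illustrative data: 200 draws from a standard normal distribution.
random.seed(0)
sample = [random.gauss(0, 1) for _ in range(200)]
d = ks_statistic(sample, NormalDist(0, 1))
print(f"D = {d:.3f}, approximate p = {ks_p_value(d, len(sample)):.3f}")
```

A small p-value is evidence against normality; a large one only means the test failed to reject it. Note that if the mean and standard deviation are estimated from the same sample, this p-value is too lenient. In practice, `scipy.stats.kstest` and `scipy.stats.shapiro` offer tested implementations.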
