Data Desk® FAQ

If you are a registered user of Data Desk, you are entitled to free technical support. We recommend that you consult your manuals before contacting us, as many of the answers to the questions we receive can be found there. You may also want to peruse the FAQs below, which cover a variety of questions from general to technical.

Contact Us to submit an issue to the Data Desk Technical Support Team

What operating systems does Data Desk support?

Data Desk RP (8.3) for Windows runs on Windows 7 through Windows 11, 64-bit only. Data Desk RP (8.3) for Mac runs on OS X 10.9 through macOS 13.0 (Ventura).

Norton SONAR Quarantined Data Desk 8.exe

There is a way to bring Data Desk 8.exe out of quarantine:

  1. In the Security History window, in the Quarantine view, select the item that you want to restore.
  2. Click Options.
  3. In the Threat Detected window, click Restore. In the case of non-viral threats, you can use the Restore & exclude this file option instead. This option returns the selected Quarantine item to its original location without repairing it and excludes the item from detection in future scans.
  4. In the Quarantine Restore window, click Yes. In the case of non-viral threats, you can use the option that is available in this window to exclude the security risk. Norton AntiVirus does not detect the security risks that you exclude in future scans.
  5. Click Close.

Mac Unidentified Developer Error

Apple has introduced new security features that can block apps from unidentified developers. You can add Data Desk 8.1 as an exception to this rule by holding Control and clicking the Data Desk file. From there, click Open. A pop-up will appear asking if you are sure you want to open this app. Click Open, and Data Desk 8.1 will be added to the exceptions list in your system’s security settings. You can read more about this from Apple Support.

Nonlinear Model Error ID=3

If you receive an error when you choose Calc > Nonlinear Models, you need to install the library folder, which can be downloaded from our User’s Forum along with a step-by-step fix.

Can Data Desk work with really large files?

Data Desk has a theoretical limit of about 2 billion cases. We don’t know of anyone who has hit that limit, but we do know of users who have analyzed datasets with over a million cases. The most important factors affecting performance are the speed of your computer’s processor and the amount of available RAM. Most computers should be able to analyze 25,000 data points comfortably.

You can use the Data Desk demo to quickly check how Data Desk performs with large datafiles on your system. Launch the Data Desk demo and choose Generate Random Number from the Manip menu. Type the number of desired variables in the first field (about 5 is usually sufficient) and the number of cases in the second field (go ahead and try a big number like 20,000 or 30,000). Press the OK button. Data Desk generates the random variables and opens the window holding the variables’ icons. Now use the variables to create some plots or tables. Try a scatterplot, a regression or even a 3D rotating plot.

Can Data Desk open data saved in a spreadsheet file?

As of Data Desk 8.1, Data Desk can open XLSX files. Files saved in older spreadsheet formats can be converted to XLSX from within Excel.

Why are all my two-digit date calculations different?

Before the year 2000, the Macintosh interpreted all two-digit dates with a year after 11 as years in the 20th century, and two-digit dates with a year before 11 as years in the 21st century. Now, however, the two-digit cut-off year is 91, which may throw off date calculations on data collected or entered before 1991. The best solution to this problem is to work with dates stored with four-digit years. One easy way to do this is to create a new derived variable and type in the following, replacing var with your date variable and cut-off year with the cut-off you need:

IF Right(var, 2) < cut-off year THEN Left(var, Len(var)-2) & "20" & Right(var, 2) ELSE Left(var, Len(var)-2) & "19" & Right(var, 2)
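
The derived-variable expression above can be hard to read on one line. As a rough illustration only, here is the same logic sketched in Python rather than in Data Desk's expression language. The function name, the sample dates, and the cut-off value of 11 are all hypothetical, and the sketch assumes every date string ends in a two-digit year.

```python
def expand_two_digit_year(date_text: str, cutoff: int) -> str:
    """Rewrite a date that ends in a two-digit year so it ends in four digits.

    Mirrors the IF ... THEN ... ELSE expression: years below the cut-off
    get the prefix "20", all other years get the prefix "19".
    """
    head, year = date_text[:-2], date_text[-2:]
    century = "20" if int(year) < cutoff else "19"
    return head + century + year


# Hypothetical sample values; a cut-off of 11 is used only as an example.
for text in ["3/15/05", "7/4/76", "12/31/10"]:
    print(text, "->", expand_two_digit_year(text, cutoff=11))
```

Choosing the cut-off is the important decision: it should sit between the latest 20th-century year and the earliest 21st-century year that actually occur in your data.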

How do I do a subset analysis in Data Desk?

(Chapter 13 in the Data Desk Handbook) Subset analysis is performed with user-defined indicator variables called Selector variables. Selector variables may be assigned in a variety of ways.

The most direct method of applying a selector variable is to drag the selector variable into the analysis you want to restrict. All Data Desk analysis tables and some plots allow selector variables to be dragged into them.

Another way to apply a selector variable to a display or table is to select the icon of the selector variable and choose {Selector} Assign Selector from the plot or table’s HyperView menu. When you assign a selector, take care that only the selector variable’s icon is selected.

Selector variables can also be assigned using a Selector button. Select the selector variable’s icon and choose {Special > Selector} Assign. Data Desk creates a selector button, which appears in the lower left corner of the Data Desk window. Initially, it is turned on (highlighted). Click the Selector button to toggle it off and on. When the Selector button is highlighted (on), all Data Desk commands operate only on the cases marked as 1 in the selector variable. After a command is executed, the button turns off. Press the button again to highlight it and invoke selection for the next command.

How do I print multiple plots and tables on the same page?

(Chapter 15 in the Data Desk Handbook) Layout windows can be used to print multiple plots and tables on the same page. To create a new layout window, select {Data > New} Layout. Data Desk creates a new layout window and opens it.

To place pictures of plots or tables in the layout window, drag the icon (or icon alias) of the plot or table window into the layout window. Alternatively, choose the {Edit} Copy Window command to copy a picture of the window, click on the layout window and choose {Edit} Paste. If you drag the icon of an unopened window into a layout window, Data Desk creates a button that links to that window; when the button is pressed, the window opens. To reposition a picture in the layout window, click on it and drag it where you would like it. Plots in layout windows are transparent, so you can overlay several plots.

When you add a picture to a layout window, the date and time are automatically recorded as part of the title. Such documentation helps to track the history of your analysis and provides a type of audit trail. You may hide this title by clicking on the HyperView menu triangle that is shown at the left of the title and choosing ‘Hide Title’. If you choose ‘Remove Link’, the entire link is removed and the picture becomes static. This command cannot be reversed!

Data Desk also lets you type or paste text into layout windows. When a layout window is frontmost, pressing any letter, number or symbol key on the keyboard or pasting text creates a text box within the layout window. This allows you to make personal comments on what you have observed during each stage of your analysis.

How do I generate an interaction plot in Data Desk?

When the selected factor in the Results panel of the Linear Model design view is a two-way interaction, Data Desk offers an interaction plot from a HyperView menu for the title of the Expected Cell Means table. First, select the interaction term from the Result for factor panel. Next, click to the right of the phrase: “Expected Cell Means of:” and choose Interaction Plot of ‘dependent variable’ by ‘interaction term’ from the context-sensitive HyperView menu. An interaction plot is a dotplot of the expected cell means by the categories of one of the main effects in the interaction, with lines added by group to connect points according to the other term in the interaction.

How can I calculate Frequency Counts for crossed data?

Select the two variables that you would like crossed. Choose {Manip > Transform > Misc} Cross. A new variable will be created titled ‘Cross’. Select this variable and choose Frequency Breakdowns from the Calc menu. This will give you the total count for all of the possible combinations of the two variables.
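
To make the idea of a crossed variable concrete, here is a minimal Python sketch of the same two steps performed outside Data Desk. The data, the function names, and the "a x b" label format are hypothetical; Data Desk's own labels for the ‘Cross’ variable may look different.

```python
from collections import Counter

# Hypothetical example data: two categorical variables, one value per case.
sex = ["male", "female", "female", "male", "female"]
smoker = ["yes", "no", "yes", "yes", "no"]


def cross(first, second):
    """Combine two variables case by case into one crossed category label."""
    return [f"{a} x {b}" for a, b in zip(first, second)]


def frequency_breakdown(values):
    """Count how many cases fall into each category."""
    return Counter(values)


counts = frequency_breakdown(cross(sex, smoker))
for category, count in sorted(counts.items()):
    print(category, count)
```

In this sketch a combination that never occurs in the data (here, "male x no") simply has a count of zero when looked up.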

How do I resize and move Note, Picture and Button windows?

To move any Data Desk window that does not have a title bar (for example, Note, Picture, Button and Socket windows), hold down the Option key on Mac (Ctrl key on Windows), click anywhere on the window and drag the window where you would like to place it. To resize any Data Desk window that does not have a title bar, hold down the Option key on Mac (Ctrl key on Windows), then click and drag the bottom right corner of the window to the size you would like.

How do I apply a selector to a nonlinear curve fit?

Because Data Desk’s nonlinear modeling command uses templates programmed in the internal Action Programming Language, selector and group buttons have no effect on analyses computed using this command. Several additional steps are needed to restrict nonlinear models to a subset of points.

If you are entering your own function (Custom), follow the steps below.

  1. Create your selector variable. For this example, call it ‘Selector’.
  2. Open the Nonlinear Model custom template. Press the “Change Loss Function” button. A new window titled “Loss Function” will open. Within that window, you can edit the text in the “Loss Fn” window. Change the text to the following: ssq(‘resids’ for ‘Selector’=1) This tells the ssq computation to restrict itself to the 1’s in the selector variable.
  3. Open the results and choose Show Plot Info from the top scatterplot’s HyperView menu. Drag Selector into the selector line.

If you are using one of the pre-built Nonlinear models, there are a couple of extra steps.

  1. Create your selector variable. For this example, name it ‘Selector’.
  2. Press the Open Results button.
  3. Click on the word ‘sumsq’ in the Coefficients & Sum of Squares window. Choose ‘Locate Sumsq’ from the HyperView menu. Data Desk finds and selects the derived variable that computes the sum of squared residuals. Open this derived variable.
  4. Edit the text after “for” to say: (numeric(‘Y’) and ‘Selector’ = 1) For example, the Exponential Fit template should look like the following: ssq(‘ypred’-‘Y’ for (numeric(‘Y’) and ‘Selector’ = 1))
  5. Open the results and choose Show Plot Info from the top scatterplot’s HyperView menu. Drag Selector into the selector line.
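
To show what the edited loss function computes, here is a small Python sketch of a sum of squared residuals restricted to selected, non-missing cases. It is an analogue written outside Data Desk, not the template's actual code. The function name and data are hypothetical, and None stands in for a missing Y value.

```python
def restricted_ssq(ypred, y, selector):
    """Sum of squared residuals over cases where Y is present and selector is 1.

    Analogous in spirit to: ssq('ypred'-'Y' for (numeric('Y') and 'Selector' = 1))
    """
    total = 0.0
    for predicted, observed, flag in zip(ypred, y, selector):
        if observed is not None and flag == 1:
            total += (predicted - observed) ** 2
    return total


# Hypothetical data: the third case has a missing Y, the fourth is deselected.
y = [2, 4, None, 9, 16]
ypred = [2, 5, 7, 8, 14]
selector = [1, 1, 1, 0, 1]

print(restricted_ssq(ypred, y, selector))
```

Only the first, second, and fifth cases contribute here, so the fit is driven entirely by the selected subset.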

How do I set up a Repeated Measures Analysis in Data Desk?

There are two options available for computing a repeated measures analysis. One uses the multivariate ANOVA; the other uses a nested design form.

To compute the analysis using the multivariate ANOVA option, your repeated observations must be entered as separate variables, for example, day 1, day 2, day 3, and so on. You need to have a variable that records treatment type for each subject in the study, where each subject is a row in your dataset. Select your observations as Y variables and your treatment as an X variable. Choose Calc > Linear Models. Data Desk opens the Linear Model design view. Click on the button next to “Type of analysis:” that says MANOVA and select Repeated Measures from the pop-down menu. To compute and view the results, click the arrow next to “Results” to open the Results panel inside the Linear Models design view.

Repeated measures can also be computed using a nested form. You must have one variable that records observations, a variable that records the corresponding treatment, a subject variable, and the repeat variable, which names the repeats. (See pages 29/8 and 29/9 for a schematic representation and an example.) Select the observations variable as Y and the other three variables as X. Choose Calc > Linear Models. Data Desk opens the Linear Models design view. In the Factors panel, nest the Subject factor inside the Treatment factor. Next, open the Interactions panel (click on the arrow next to “Custom Interactions”) and specify the Treatments*Repeats interaction term. To compute and view the results, click the arrow next to “Results” to open the Results panel inside the Linear Models design view.

Advantages of multivariate repeated measures:

  * Easier to specify.
  * Faster and smaller; may be able to compute under memory limits when the nested form cannot.
  * Offers dotplots of responses in repeat order with lines connecting subjects; a useful diagnostic display.

Disadvantages of multivariate repeated measures:

  * Less flexible; can’t omit interactions.
  * Can’t compute expected cell means, coefficients, or post-hoc tests.
  * Cases that miss even one Repeat are omitted from the analysis.
  * One Repeat factor only.

Advantages of the nested calculations:

  * Greater flexibility; can omit interactions.
  * Can compute expected cell means, coefficients, and post-hoc tests for all terms.
  * Missing observations are omitted only for the repeat on which they are missing; the subject can be kept in the analysis.
  * Multiple Repeat factors are possible.

Disadvantages of the nested calculations:

  * More complex to specify; may require data manipulation to put variables in the correct form.
  * Slower and larger. May have difficulty completing the calculation for large files without a large amount of memory.
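
The data manipulation needed for the nested form amounts to reshaping one-row-per-subject data into one-row-per-observation data. Below is a minimal Python sketch of that reshaping, done outside Data Desk. The column names and values are hypothetical, and None stands in for a missing observation.

```python
# Hypothetical "multivariate" layout: one row per subject, one column per repeat.
wide = [
    {"subject": "s1", "treatment": "A", "day 1": 5.0, "day 2": 6.0, "day 3": 7.0},
    {"subject": "s2", "treatment": "B", "day 1": 4.0, "day 2": None, "day 3": 6.5},
]
repeats = ["day 1", "day 2", "day 3"]


def to_nested_form(rows, repeat_names):
    """Return one record per (subject, repeat) with a non-missing observation."""
    long_rows = []
    for row in rows:
        for name in repeat_names:
            value = row[name]
            if value is None:
                continue  # drop only the missing repeat, keep the subject
            long_rows.append(
                {
                    "observation": value,
                    "treatment": row["treatment"],
                    "subject": row["subject"],
                    "repeat": name,
                }
            )
    return long_rows


nested = to_nested_form(wide, repeats)
for record in nested:
    print(record)
```

Subject s2 keeps two of its three observations in the nested layout, which illustrates why the nested form handles missing repeats more gracefully than the multivariate form.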

How do I compute cross tabs or contingency tables when my data is tabulated, i.e. Male 13 Female 15?

Sometimes data come to us that have already been summarized. So, for example, instead of getting a file with a row for each individual, you might receive a file with two rows: one row holding the number of males and the other the number of females. Data Desk always uses data that have not been summarized for its analyses. The Replicate Y by X command converts summarized data to individual records so that Data Desk can use the data in other analyses.

Replicate Y by X creates a new variable which repeats categories in Y the number of times listed in X. For example, suppose that you had two variables, each with two cases, one called sex and one called replicates. The variable sex contains the text string ‘male’ in the first case and the text string ‘female’ in the second case. The replicates variable holds the value 13 in the first case and 15 in the second case, indicating 13 males and 15 females. If you select sex as y and replicates as x and choose Replicate Y by X from the Manip menu, Data Desk creates a new variable called sex:replicates, holding 28 cases: 13 cases of ‘male’ followed by 15 cases of ‘female’. This variable can be used to create bar charts and frequency tables, and as a factor in linear models.
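
The effect of Replicate Y by X on the example above can be sketched in a few lines of Python. This is an illustration of the described result, not Data Desk's implementation, and the function name is hypothetical.

```python
def replicate_y_by_x(y, x):
    """Repeat each category in y the number of times given by the matching count in x."""
    expanded = []
    for category, count in zip(y, x):
        expanded.extend([category] * count)
    return expanded


sex = ["male", "female"]
replicates = [13, 15]

sex_replicates = replicate_y_by_x(sex, replicates)
print(len(sex_replicates))
print(sex_replicates[:3], sex_replicates[-3:])
```

The result has one entry per individual, which is the unsummarized form that the other analyses expect.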

How do the date and time functions differ between the Mac and Windows versions?

Date and time functions are calculated with the base date of Jan 1, 1904. For example, if you typed “Days(1/1/93)” it would return a value of 32509, which equals the total number of days from January 1, 1904, to January 1, 1993.

There are differences in the date and time functions between Mac and Windows regarding the interpretation of two-digit year notation (i.e., 93 instead of 1993). On the Macintosh, any two-digit year between 11 and 99 will be translated to 19XX, but any two-digit year between 00 and 10 will be translated to 20XX. Therefore, dates before 1/1/1911 and after 12/31/2009 must include the century digits (2011 as opposed to 11).

On Windows, it depends on which oleaut32.dll file you have in your system folder. With one of the oleaut32.dll files, the results are the same as on the Mac. If you type “Days(1/1/30)” and it returns the value of 2558, then you have one that is comparable to the Mac scenario. If your system uses the other oleaut32.dll file, any two-digit year between 30 and 99 will be translated to 19XX, but any two-digit year between 00 and 29 is assumed to be 20XX. Therefore, dates before 1/1/1931 and after 12/31/2030 must include the century digits. If you type “Days(1/1/30)” and it returns the value of 39083, then you have the later of the two oleaut32.dll files.

Unfortunately, there is no easy way to determine which oleaut32.dll file your system is using. The best way is to experiment with several dates and see what values are returned.

How can I import a fixed format file into Data Desk?

To open a fixed-format file, launch Data Desk and choose “Open Datafile…” from the File menu. Use the Open dialog to find and open the fixed-format file. Press the “Use Fixed Format” button. Data Desk displays the first group of characters of the top row of the file. Type in the number of characters in the first variable, enter the variable name, and press the Next button. Continue until you have defined all of your variables, and then press the Done button.
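
If it is unclear what the character counts mean, the following Python sketch shows how a fixed-format row is cut into variables by width. It is an illustration outside Data Desk. The field names, widths, and sample rows are hypothetical.

```python
def parse_fixed_format(lines, fields):
    """Split each line into variables using (name, width) pairs, left to right."""
    columns = {name: [] for name, _ in fields}
    for line in lines:
        position = 0
        for name, width in fields:
            columns[name].append(line[position:position + width].strip())
            position += width
    return columns


# Hypothetical layout: 10 characters of name, 2 of age, 1 of sex.
fields = [("name", 10), ("age", 2), ("sex", 1)]
lines = [
    "Smith".ljust(10) + "34" + "M",
    "Jones".ljust(10) + "27" + "F",
]

columns = parse_fixed_format(lines, fields)
print(columns)
```

Each width you type in the Data Desk dialog plays the same role as one width in the fields list: it says how many characters belong to the next variable.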

How can I copy selected cases?

To copy selected cases from a Data Desk relation, first open the variables that you would like copied for the specified cases. Be sure there is an editing sequence number in each variable you wish to include with the copy. The editing sequence is located at the top of the scroll bar on the right of each variable editing window. If there is a gray box instead of a number, click on the gray box. A number appears. This number represents the position of the variable’s column when you copy. (Note: The order in which the variable windows appear on the screen does not matter; the order in which they are copied is determined by the editing sequence number.) After you have assigned the editing sequence number for each variable, select the cases by dragging and highlighting the cases in the editing windows or by selecting points in a plot or table. Choose Copy from the Edit menu. The cases are now ready to be pasted either into another window in Data Desk or into another program.

Why does it seem that the query tool doesn’t work on derived variables?

The Query tool displays the text or value in the frontmost variable window for the selected case. To have the Query tool display the values computed by a derived variable, open the derived variable and choose the “Show Numbers” command from its HyperView menu. Be sure the Show Numbers window is open and is the frontmost variable window. Then select the Query tool from the Tool palette and click on a point in any Data Desk plot that displays individual points.

How do I recode variables? For example, 1 and 2 = ‘small’, 3 = ‘medium’, 4 = ‘large’.

There are two ways that you can do this within Data Desk.

The first is by using derived variables. Create a derived variable (Data > New > Derived Variable) and type the following expression:

If ‘var’ = 1 or ‘var’ = 2 then “small” else if ‘var’ = 3 then “medium” else “large”

The second method requires a graphical selection of the cases you want to recode. First, select ‘var’ and choose Bar Charts from the Plot menu. Open the editing window for ‘var’ (double-click on the variable ‘var’). Select the pointer tool from the tools palette and highlight the cases for categories 1 and 2 by clicking on the bars corresponding to those categories in the plot (hold the Shift key down to select the second bar). Click on the title bar of the ‘var’ window you are recoding, choose Replace from the Edit menu and type “small” in the Replace dialog. All the selected cases change from “1” or “2” to “small”. Repeat the same process for all other categories you want to recode.
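
For comparison, the derived-variable expression for this recoding can be written as an ordinary if/else chain. The Python sketch below is an analogue outside Data Desk, with a hypothetical function name and sample data. As in the expression it mirrors, any value other than 1, 2, or 3 falls through to “large”.

```python
def recode(value):
    """Map 1 and 2 to 'small', 3 to 'medium', and everything else to 'large'."""
    if value == 1 or value == 2:
        return "small"
    elif value == 3:
        return "medium"
    else:
        return "large"


var = [1, 2, 3, 4, 2]  # hypothetical cases
recoded = [recode(v) for v in var]
print(recoded)
```

If your data can contain values outside 1 through 4, consider adding an explicit test for 4 so that unexpected values are not silently labeled “large”.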

What is the difference between selectors and hot selectors?

The difference between selectors and hot selectors is subtle, but important. Selectors are static 0/1 indicator variables. Once you define the 0/1 code for each case, it does not change unless you explicitly change the numbers in the variable or, if your selector is based on a derived variable, the derived variable expression. Hot Selectors are 0/1 indicator variables that are dynamic and are based on the selection state of the cases in the relation. The 0/1 coding for each case is determined by the selection state of that particular case at that particular time: highlighted cases are coded with a 1, unhighlighted cases are coded with a 0. You can change the selection state and, therefore, the 0/1 code by selecting new points in any plot or table.

With Hot Selectors you can say “I want to see that regression, but only for the males” by simply assigning a HotSet selector to the regression and selecting the male bar in a gender bar chart or frequency table. Then you can say “Let me see that regression, but this time only use the female points” and you can accomplish that by clicking on the female bar of the bar chart. The general rule is that any plot or table that has a HotSet Selector assigned to it will recompute or redisplay any time the selection state changes. The selection state can be changed by selecting points in a plot, like a histogram or scatterplot, or in a table, like a frequency table or contingency table.

How do I append data to an existing datafile?

To append data to an existing data file, open the data file that holds your existing data and choose Import from the File menu. Select the file that has your additional observations. (This could be another Data Desk file or a text file.) Follow the usual steps for importing data. Data Desk creates a new relation and places the new data in that relation. Verify that the new data have the same number of variables as the original data file. Select the existing data variables as Y’s and the additional data variables as X’s and choose Parallel Append from the Manip menu. Data Desk creates a new relation labeled Parallel Append which holds the new appended variables. The appended variables use the same names as the original variables (the ones selected as Y’s). Hint: If you don’t see the Parallel Append command under the Manip menu, make sure that you have the same number of Y and X variables selected.
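
Conceptually, Parallel Append pairs each existing variable with one additional variable and stacks the new cases underneath. The Python sketch below illustrates that pairing outside Data Desk. The function name, variable names, and values are hypothetical, and the length check mirrors the hint that the number of Y and X variables must match.

```python
def parallel_append(originals, additions):
    """Append each additional variable's cases to the matching original variable.

    Both arguments are lists of (name, values) pairs in matching order.
    The result keeps the names of the original (Y) variables.
    """
    if len(originals) != len(additions):
        raise ValueError("select the same number of Y and X variables")
    return {
        name: values + extra
        for (name, values), (_, extra) in zip(originals, additions)
    }


# Hypothetical existing data (Y's) and newly imported data (X's).
originals = [("height", [150, 160]), ("weight", [50, 60])]
additions = [("ht", [170]), ("wt", [70])]

appended = parallel_append(originals, additions)
print(appended)
```

Because variables are paired by position, the order in which you select them matters as much as the count.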

What is the difference between a Selector and a Group button?

Selector buttons work with selector variables to restrict analyses to a specified subset of points. Group buttons work with variables that hold multiple categories and result in parallel analyses for each category in the categorical variable.

What is the difference between a Relation and a Folder?

Most datasets are rectangular. There are variables (usually represented as columns) and cases (usually represented as rows). Each case has a value recorded for each variable. The recorded value may be a value defined as “missing” rather than a number or a category name. Because each case has a value for each variable and each variable has a value at each case, the array of data can be shown as a rectangular table of values as in a spreadsheet. In Data Desk this rectangular structure is known as a Relation. A Folder helps to organize variables or results into groups so that you can deal with them easily. Several icons may belong together for the following reasons: they describe the same individuals or circumstances; they contain related quantities; you plan to use them together in analysis; or you want to group them together to clean up the desktop. In Data Desk any kind of icons can be grouped into folders for any of these reasons.

How do I compare means of two groups if my data contains one variable that holds observations and another that holds groups?

The Data Desk commands that compute hypothesis tests and confidence intervals for differences between sample means require that the samples being compared reside in separate variables. Many datasets use category variables to differentiate between samples, storing the measured values in one variable and the categories in another variable. To convert data stored using a category variable to separate variables for each category, select the variable holding the measurements as Y and the variable holding the groups as X, and choose Split into Variables by Group from the Manip menu. Data Desk creates a new variable and a new relation, one for each category in the categorical variable.
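
The restructuring performed by Split into Variables by Group can be pictured with a short Python sketch. This is an illustration outside Data Desk with hypothetical names and data. The difference in means at the end only shows why the split form is convenient; it is not a hypothesis test.

```python
from statistics import mean


def split_by_group(measurements, groups):
    """Return one list of measurements per category, keyed by category name."""
    split = {}
    for value, group in zip(measurements, groups):
        split.setdefault(group, []).append(value)
    return split


# Hypothetical data: one measurement variable and one group variable.
measurements = [12, 15, 10, 18, 14, 15]
groups = ["control", "treated", "control", "treated", "control", "treated"]

samples = split_by_group(measurements, groups)
difference = mean(samples["treated"]) - mean(samples["control"])
print(samples)
print(difference)
```

Note that the split samples need not have equal lengths, which is one reason each category ends up in its own structure.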

Can Data Desk compare data sampled at irregular intervals?

Hello Data Desk User, You ask an interesting question. Your question is really about data structures. Data Desk, unlike most other statistics packages, has relational database functions, so it does understand the differences among the kinds of data you mention and has tools that can deal with them. However, you must think very clearly about what you want.

First, a definition: A relation is a data structure in which all variables are about the same cases. You can think of it as a rectangular data table in which rows are the cases and columns are the variables. For example, quarterly data, such as GDP, are in a relation in which each case is an economic quarter. S&P data are in a different relation with daily data. All statistical methods that deal with multiple variables require variables that are part of the same relation. So ultimately, you’ll have to arrange for your variables to be in a single relation.

However, there are certainly times when variables in different relations can be related to each other. In database terms, that requires a relational lookup. So, for example, although S&P500 is daily, we do know which quarter each day is in. So it is possible to look up in the Quarterly relation to get the GDP for that quarter. If you do that for each day, the resulting variable will be in the Daily relation but will repeat the same GDP value (in effect) for each day in that quarter.

If that is what you are looking to do, then Data Desk does have the functions you need. Look at the Help files (or documentation) for Derived Variables and within that for the Relational Functions. You’ll find functions such as GetCase(y, x) which takes x as a list of case numbers (for example quarters) and returns the case in y at that case value. Lookup(y, x) returns the case number of a case of y for which y = k. Using these and other functions like them, you should be able to look up the GDP value in the quarterly relation by knowing the day number in the Daily relation (provided you can write the function that specifies the quarter number from the day number). A derived variable that uses these operations could be a member of the Daily relation but would hold the GDP value for the quarter to which each day belongs.

I always have to debug my use of these operations. A convenient way to do that is within a scratchpad, where you can type an expression and evaluate parts of it as you put it together.
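
The relational lookup described above can be sketched in Python to show the shape of the result. This is an analogue written outside Data Desk, not its derived-variable syntax. The GDP figures, the day numbers, and the quarter_number helper are hypothetical, and case numbers are assumed to start at 1.

```python
# Hypothetical Quarterly relation: one GDP value per quarter (cases 1 to 4).
quarterly_gdp = [100.0, 102.5, 101.0, 104.0]

# Hypothetical Daily relation: day of a non-leap year for a few sample days.
day_of_year = [1, 45, 91, 120, 200, 300, 365]


def quarter_number(day):
    """Hypothetical helper: map a day of a non-leap year to its quarter (1 to 4)."""
    if day <= 90:
        return 1
    if day <= 181:
        return 2
    if day <= 273:
        return 3
    return 4


def get_case(y, x):
    """For each case number in x (assumed 1-based), return that case of y.

    A Python analogue of the GetCase(y, x) behaviour as described above.
    """
    return [y[case - 1] for case in x]


quarter_of_day = [quarter_number(day) for day in day_of_year]
gdp_by_day = get_case(quarterly_gdp, quarter_of_day)
print(quarter_of_day)
print(gdp_by_day)
```

The resulting list has one value per day, so it fits in the Daily relation, while each value is the GDP of the quarter that day belongs to.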
Data Description, Inc.

PO Box 4555

Ithaca, NY 14850

