## Migrating from SAS to Mathematica

M2SLink was specifically designed to allow SAS users to leave SAS behind and switch to Mathematica.

Is transitioning from SAS to Mathematica a viable option for your company?

For many SAS sites, the investment that has been made to store data in SAS datasets and to write SAS programs to analyze data and generate reports is significant. Therefore, the prospect of replacing SAS with some other statistical analysis system is daunting, to say the least.

Why should you consider migrating from SAS to Mathematica?

(1) You can reduce your operating expenses. You have to pay an annual fee to SAS Institute to renew your SAS license if you want to continue to use the SAS System. And you have to pay an additional annual fee for every module you use beyond “base” SAS. If you fail to pay your renewal fees within the grace period associated with your license, the SAS System stops working.

To be sure, Wolfram Research also has a subscription/license model which requires an annual renewal fee if you want to continue using whatever is the current release of Mathematica. But the cost of the Mathematica renewal fee is likely to be far less than the cost of your SAS license renewal. And, unlike the SAS System, if you decide not to renew your subscription, your copy of Mathematica does not stop working. It is worth pointing out that Wolfram Research also allows you to purchase Mathematica without a subscription. In that case, you own your copy of Mathematica free and clear without any obligation to pay a renewal fee. This is not true of the SAS System. You never own your copy of SAS; in order to keep using SAS year after year, you must pay a license renewal fee.

(2) You can work within a framework that is state of the art. The user interface provided by SAS has not been updated since the early 1990s. Its three-part window architecture, consisting of a “Code” window in which you input your SAS code, a “Log” window in which SAS reports the errors found when it tried to run your code, and an “Output” window that provides a table of numbers representing the result of running your code, has remained unchanged for decades. Now SAS does offer an additional product, Enterprise Guide, which purports to provide you with a graphical user interface to SAS. But Enterprise Guide costs extra and does not really provide a very high level of abstraction over SAS itself. In other words, if you are not already familiar with the quirks of the SAS programming language, you probably won’t be able to get very far using Enterprise Guide.

On the other hand, the notebook interface provided by Mathematica is exceptional. It is light-years beyond the Code window/Log window/Output window interface that SAS supports. Within Mathematica's notebook interface, you can intermingle your data and your code and go back and forth between the two, making edits as necessary. This makes your workflow vastly superior to what you experience using SAS. Wolfram clearly cares about the user experience, and Mathematica's notebook interface is constantly being updated and improved. It even supports natural language input.

Mathematica is in a class by itself when it comes to symbolic computation. The IML procedure in SAS is the closest thing SAS provides to symbolic computation. But the IML procedure is not nearly as powerful or as comprehensive as Mathematica in its support for matrix manipulation. And while SAS's IML procedure is limited to computation involving matrices, Mathematica supports symbolic computation in all areas of mathematics. Sometimes, even a statistician needs to differentiate or integrate a function; while this is very easy to do in Mathematica, it's impossible using SAS alone.

The Wolfram Language, upon which Mathematica is based, is vastly superior to the SAS programming language in terms of overall design. The SAS System is really several programming languages in one. For example, in order to manipulate your data, you must write code that conforms to the so-called "Data Step", which has a peculiar and in many ways counter-intuitive logic. And its grammar is very different from the grammar that must be observed when writing code that runs a SAS procedure such as PROC REG or PROC QLIM. Last but not least, if you want to customize a report, you will have to master yet another SAS grammar that serves this purpose.

Mathematica has always been and continues to be on the cutting edge when it comes to user interaction and the graphical display of data.

Even if you are able to master the obscure part of the SAS programming language that allows you to customize a report, once your report has been rendered, it is “DOA” — dead on arrival. SAS does not allow you to interact with the report in any way other than to print it. SAS provides no user interface that allows you to dynamically modify the content of the report itself by, say, moving your mouse. This is because the reports that SAS generates are just HTML files that contain static pictures.

By contrast, Mathematica allows you to interact with your data and your reports in a highly dynamic, user-friendly way. For example, there are several 3D plots in Mathematica that allow you to rotate and spin the plot simply by moving your mouse. This takes visual analysis to a higher level that does not exist in SAS. Furthermore, using Mathematica's Manipulate function, you can create output with which you can interact dynamically by manipulating controls that you have specified to be part of the output itself. For example, you can create a bubble plot that includes a slider control. The slider control can be connected to the code that is responsible for computing your model so that when you move the slider control with your mouse, your model is recomputed based on the slider control's current value and the bubble plot is updated.

(3) Finding developers to maintain and enhance your code is a lot easier. The SAS programming language is idiosyncratic and outdated. It does not conform to or support any modern programming methodology, such as functional or object-oriented programming. Worse still, the grammar for SAS procedures is inconsistent from one procedure to the next. The upshot is that it can be very difficult to find developers who are familiar with the SAS programming language and who are interested in programming in its antiquated framework.

By contrast, Mathematica supports all modern programming methodologies. Although the preferred methodology is functional, you can write programs using object-oriented, declarative, and procedural paradigms. Today's computer science and statistics students who know R, Java, and/or Python can leverage the skills they have already acquired to write code in Mathematica. This is not so with respect to SAS. Nothing other than a course in SAS programming itself can prepare a new hire for the task of writing SAS code. And in most cases, your new hire will need to take several courses in SAS programming before they become proficient.

The Wolfram Language is extremely well designed and consistent from one function to the next. Unlike in SAS, you will never find a situation in Mathematica where an option with the exact same name does totally different things depending on the function in which it is used.

How does M2SLink help you to migrate from SAS to Mathematica?

As long as your data is contained in a SAS dataset, SAS holds your data captive and you are forced to continue to license SAS in order to access your data.

With its m2sImport function, M2SLink allows you to migrate your data from SAS to Mathematica in a very efficient and relatively painless manner. Although you could convert your SAS datasets into SAS Transport files and then import them into Mathematica, such a process is tedious and valuable metadata may be lost in translation. Worse still, important conversions involving dates won't be performed and the data that results will need further manipulation to ensure that it is correct. By contrast, M2SLink preserves the metadata associated with your SAS dataset and makes the necessary conversions for you automatically, so that date-time values are accurately converted from the SAS "zero hour" to Mathematica's "zero hour". Once you have moved your data out of the proprietary SAS dataset file format and into Mathematica, you will be free to use any tool you like to analyze your data. We believe that in most cases, you will find that the statistical functionality provided by Mathematica is capable of duplicating most if not all the analyses you currently perform using SAS. Furthermore, with Mathematica, you will be able to easily manipulate your data in ways that are extremely difficult to duplicate in SAS using its Data Step.
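To make the "zero hour" conversion concrete, here is a minimal sketch of the arithmetic involved. It is written in Python purely for illustration and is not M2SLink's implementation; M2SLink performs this conversion for you automatically. The sketch assumes that SAS date values count days and SAS datetime values count seconds from January 1, 1960, and that Mathematica's absolute time counts seconds from January 1, 1900. Time zones are ignored, and all function names are hypothetical.

```python
from datetime import date, datetime, timedelta

# Assumed epochs ("zero hours"): SAS counts from 1960-01-01,
# Mathematica's absolute time counts from 1900-01-01.
SAS_EPOCH = datetime(1960, 1, 1)
WOLFRAM_EPOCH = datetime(1900, 1, 1)

SECONDS_PER_DAY = 86400

# Number of seconds separating the two epochs.
EPOCH_OFFSET_SECONDS = int((SAS_EPOCH - WOLFRAM_EPOCH).total_seconds())


def sas_datetime_to_absolute_time(sas_seconds):
    """Shift a SAS datetime value (seconds since 1960) to seconds since 1900."""
    return sas_seconds + EPOCH_OFFSET_SECONDS


def sas_date_to_absolute_time(sas_days):
    """Convert a SAS date value (days since 1960) to seconds since 1900."""
    return sas_days * SECONDS_PER_DAY + EPOCH_OFFSET_SECONDS


def sas_date_to_calendar(sas_days):
    """Return the calendar date that a SAS date value represents."""
    return (SAS_EPOCH + timedelta(days=sas_days)).date()
```

A raw SAS date such as 21915 is meaningless on its own; without the conversion it would be read as a plain integer rather than as January 1, 2020. This is the kind of manual correction that a SAS Transport file import leaves to you.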

As you migrate from SAS to Mathematica, there will undoubtedly be various SAS programs that you will need to convert into Mathematica functions. With its m2sSubmit function, M2SLink makes it easy to run your old SAS programs from within a Mathematica notebook and to obtain the data content of the reports that SAS generates. Thus, you don’t have to go back and forth between SAS and Mathematica in order to make the transition. Instead, you can work from your Mathematica notebook environment and interact with SAS to test your Mathematica code against your SAS code efficiently and effectively.

When is migrating from SAS to Mathematica not a viable option?

In earlier versions of the product, Mathematica's support for statistical analysis was very limited.

Unfortunately, this created an impression, which lingers to this day, that Mathematica is not really suitable for serious work involving statistics. However, this is no longer true. Over the past 15 years, Wolfram Research has significantly extended Mathematica's statistical analysis capabilities. Using Mathematica, it is now possible to construct highly sophisticated models using all the well-known statistical distributions. As such, Mathematica now rivals SAS and other statistical applications such as SPSS, Minitab, and JMP.

Here is a link to a technical note at the Wolfram Research website which covers this topic in great detail:

Summary of Mathematica's Support for Statistics

Nevertheless, it may be the case that your company depends on certain statistical procedures in SAS that are not duplicated in Mathematica. If so, then you obviously cannot completely eliminate your dependency on SAS. However, it may still be worth migrating to Mathematica those parts of your operation which do not have a critical dependency on SAS. This will allow you to decrease the number of users allocated to your SAS license, and thereby lower expenses. For example, if there is a particular analysis or report that can only be produced by SAS, you could convert your SAS license to a single-user license and still have the analysis and report generated. The SAS license user could then use M2SLink's m2sImport function to import the SAS dataset into their Mathematica notebook. By default, when a SAS dataset is imported using the m2sImport function, it is encapsulated in a Mathematica Dataset object. At that point, the SAS license user could share the Mathematica Dataset object in their notebook with the other employees who are using Mathematica and who need to have access to the data. The same is true for any data contained in a SAS-generated report. And if one of the Mathematica users needs to modify the data that came from SAS, that individual can provide the modified data to the SAS license user, who can then use M2SLink's m2sExport function to convert the data in their notebook into a SAS dataset. The SAS user can then open the modified dataset in SAS.

But what if you need to analyze a dataset that contains, say, a billion observations? Can you use Mathematica for this purpose?

The answer is yes. A few years ago, SAS made a big splash by introducing SAS Viya, its high-performance solution for "big data". However, one fact that many industry observers overlooked is that Wolfram has been quietly supporting grid computing for many years. Wolfram's high-performance solution is called gridMathematica.

Here is a link to the Wolfram Research website which presents Wolfram's high performance solution:

Wolfram's High Performance Solution

But can you transfer a "really big" SAS dataset into gridMathematica using M2SLink?

Once again, the answer is Yes, provided the SAS dataset does not contain more than 2 billion observations. Unfortunately, if you have a SAS dataset that contains over 2 billion observations, you cannot directly import your dataset into Mathematica in one single step using M2SLink. This is because M2SLink uses Wolfram's WSTP protocol to transfer your data between SAS and Mathematica, and at the present time, the lists supported by WSTP cannot exceed 2 billion in length. (We are hopeful that this restriction will be lifted at some point in the near future.) Nevertheless, it is possible to import your data into gridMathematica using M2SLink provided you are willing to break down the original SAS dataset into smaller datasets, which can then be individually imported. Once your SAS data has been imported in manageable pieces using M2SLink into gridMathematica, it can easily be recombined into a single list or Mathematica Dataset object. To be sure, breaking down the original SAS dataset into pieces that can be consumed by M2SLink will take a certain amount of time and effort. But presumably, you will only need to do it once.
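The splitting step described above comes down to simple range arithmetic. The following is a minimal sketch, written in Python purely for illustration, of how the observation ranges for each piece might be computed. It is not part of M2SLink; the function name and the chunk size are hypothetical, and you would pick a chunk size comfortably below the limit that applies to your installation. The 1-based, inclusive ranges it returns are meant to line up with the FIRSTOBS= and OBS= dataset options that SAS provides for selecting a range of observations.

```python
def chunk_ranges(total_obs, max_chunk):
    """Split observations 1..total_obs into consecutive (first, last) ranges.

    Both bounds are 1-based and inclusive, and no range holds more than
    max_chunk observations.
    """
    if total_obs < 0:
        raise ValueError("total_obs must be non-negative")
    if max_chunk <= 0:
        raise ValueError("max_chunk must be positive")

    ranges = []
    first = 1
    while first <= total_obs:
        # The last piece may be shorter than max_chunk.
        last = min(first + max_chunk - 1, total_obs)
        ranges.append((first, last))
        first = last + 1
    return ranges


if __name__ == "__main__":
    # Hypothetical example: 5 billion observations in pieces of 1 billion.
    for first, last in chunk_ranges(5_000_000_000, 1_000_000_000):
        print(f"firstobs={first} obs={last}")
```

Because the ranges are consecutive and do not overlap, importing the pieces in order and joining them reproduces the original observation order.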

If you are interested in transitioning from SAS to Mathematica and have any questions, please contact us.

The opinions expressed in this white paper are those of Harper Corditt Software alone. Although Harper Corditt Software is a development partner with Wolfram Research, the reader should not interpret the statements made above as representing the views of Wolfram Research or any of its employees.

M2SLink for Mathematica © 2020 Harper Corditt Software. All rights reserved.

Mathematica® is a registered trademark of Wolfram Research.

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.