Chapter 1 Managing your postural data
Before analysing your postural data, you need to structure them. This section of the book illustrates how to structure postural data to facilitate analysis.
1.1 Text output of AMTI NetForce
In AMTI's NetForce software, there are two ways of exporting the data collected in a session:
- As a text file (.txt): a file compatible with any data software (Excel, Numbers, R, SPSS, etc.)
- As a .bsf file: a file compatible with the BioAnalysis software
The .bsf file appears to include moments, forces, time flow, and CoP data, whereas the text file only contains the bare moments and forces.
Note: Interestingly, the two formats display slightly different outputs. To date, it is not clear why those small discrepancies occur. The most likely hypotheses are: 1) the sampling used for the .bsf file is not perfectly synchronised with the sampling used for the text file; 2) the .bsf encoding involves some pre-processing steps (yet to be identified).
In any case, for reasons of transparency and flexibility, most users in academia will prefer working with the raw .txt files.
Most use cases of force plates in academia involve one or more experimental sessions in which a participant performs a task (passively or actively) while standing on the force plate. A classic task in postural studies is, for instance, passively viewing stimuli displayed on a screen while standing on the force plate (for a meta-analysis, see Monéger et al., 2024, preprint). For the purpose of this book, we will look at a generic case.
1.2 A generic example
D’Api and De Reinette are two jumbologists, scientists interested in the effect of planes on personality, who would like to see whether some personality traits are more naturally drawn toward planes than others. They measured participants’ personality traits, and then assessed participants’ body sway toward and away from planes to capture approach vs. avoidant spontaneous reactions. To this aim, they study body sway in a passive viewing task where people watch pictures of planes vs. a control condition consisting of pictures of bikes (presented in a random order). The protocol is 100 seconds long:
- it starts with 20 seconds of watching random neutral pictures of household items,
- then 8 trials follow – each trial starts with a fixation cross (2 s), followed by the pictures (8 s) – for a total of 20 + 8 x 10 = 100 seconds
They decided to use a 100 Hz sample frequency. They collected data from 6 participants because D’Api and De Reinette, in addition to being into pseudo-scientific research, are also unaware of statistical power problems in the literature (see Monéger et al., 2024, preprint).
When looking at the minimalistic output of a text file exported from NetForce (see Figure 1), several challenges appear:
- What can those columns be?
- How to merge all the files collected (one per participant) into one data file?
- How to add a Time column that will facilitate data wrangling?
- How to compute CoP-X and CoP-Y from the moments and forces?
- How to annotate the data with periods of interest (i.e., the initial 20 s of neutral pictures, then the 8 trials including fixation crosses and pictures)?
Figure 1. Text output exported from NetForce
## V1 V2 V3 V4 V5 V6
## 1 7.24049, 7.22547, 790.05556, 5.47499, 7.68121, -0.16553
## 2 7.46415, 7.30759, 789.63462, 5.41001, 7.61477, -0.14419
## 3 7.80475, 7.35250, 789.30544, 5.34504, 7.54834, -0.12477
## 4 7.60925, 7.46016, 789.32800, 5.34504, 7.54834, -0.16260
## 5 7.49666, 6.98660, 788.61981, 5.34434, 7.41616, -0.02562
## 6 7.42719, 6.95055, 788.25881, 5.27937, 7.34973, -0.07609
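For the curious, the preview above can be reproduced with base R’s read.table() (a sketch: "Postural_DataA.txt" stands for any NetForce .txt export, and with read.table()’s default whitespace splitting the separating commas of the export stay attached to the values, as in Figure 1):
# Inspect a raw NetForce .txt export: no header row, six unnamed columns
head(read.table("Postural_DataA.txt"))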
1.2.1 Merge_PosData: Merge your postural files
The Merge_PosData command hits four birds with one stone. However, it is crucial to note that, in contrast to most other commands of the package, it requires AMTI NetForce text outputs. Using postural outputs collected and exported from different software might prove tricky: you would have to make sure that the file(s) have the same structure as the files exported from AMTI NetForce (i.e., no header row, and the same column order: Fx, Fy, Fz, Mx, My, Mz).
Tip: the package includes typical data files from the AMTI NetForce software: Postural_DataA, Postural_DataB, Postural_DataC, Postural_DataD, Postural_DataE.
The Merge_PosData command:
- Names the columns corresponding to moments and forces in an AMTI output
- Merges all the postural files grouped in a single folder on your computer
- Adds a time column that will describe the protocol’s time flow
- Automatically computes CoP-X and CoP-Y data
To do so, users only need to provide: the path to the postural files (they should be grouped in a single folder on your computer), the protocol duration, and the sample rate that was used in the protocol.
D’Api and De Reinette programmed their protocol using NetForce. Their protocol is 100 s long, and they used a sample rate of 100 Hz.
path_to_data <- system.file("extdata", package = "BalanceMate") # Find the subdirectory containing the example data in the original .txt format exported from the AMTI NetForce software
Data <- Merge_PosData(path_to_data, SampleRate = 100, SessionDuration = 100) # Input the correct arguments: in this example, the protocol was 100 seconds long and the sample rate was 100 Hz
head(Data)
## Fx Fy Fz Mx My Mz Time file_name CoP_X
## 1 7.24049 7.22547 790.0556 5.47499 7.68121 -0.16553 0.01 Postural_DataA.txt -0.9722367
## 2 7.46415 7.30759 789.6346 5.41001 7.61477 -0.14419 0.02 Postural_DataA.txt -0.9643410
## 3 7.80475 7.35250 789.3054 5.34504 7.54834 -0.12477 0.03 Postural_DataA.txt -0.9563269
## 4 7.60925 7.46016 789.3280 5.34504 7.54834 -0.16260 0.04 Postural_DataA.txt -0.9562995
## 5 7.49666 6.98660 788.6198 5.34434 7.41616 -0.02562 0.05 Postural_DataA.txt -0.9403974
## 6 7.42719 6.95055 788.2588 5.27937 7.34973 -0.07609 0.06 Postural_DataA.txt -0.9324006
## CoP_Y
## 1 0.6929880
## 2 0.6851283
## 3 0.6771827
## 4 0.6771634
## 5 0.6776827
## 6 0.6697508
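Where do CoP_X and CoP_Y come from? The values above are consistent with the usual force-plate relations CoP_X = -My/Fz and CoP_Y = Mx/Fz, expressed here in centimetres (hence a factor of 100, assuming forces in N and moments in N·m). The snippet below is only a sanity check of that reading, not BalanceMate’s internal code:
# Recompute the first CoP values from the moments and forces (sketch: assumes N, N.m, cm)
with(head(Data), cbind(CoP_X_check = -My / Fz * 100,
                       CoP_Y_check =  Mx / Fz * 100))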
Note: SessionDuration is an optional argument: the command handles the time referencing automatically on the basis of the length of your data and the sample rate, which is useful if your sessions are of unequal duration. However, if you do know the duration of your sessions, providing SessionDuration is a useful (though not mandatory) check of your data’s structure: if the data are not of the expected length given the session duration and the sample rate, the command throws an error to make you aware that the sessions are not of the expected duration. For this reason, we recommend using this argument whenever your sessions have equal durations.
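For instance, if D’Api and De Reinette’s sessions had unequal durations, they could simply drop the argument (a sketch of the same call as above, without the length check; Data_unchecked is a hypothetical name):
# Without SessionDuration, the time referencing is derived from each file's length
Data_unchecked <- Merge_PosData(path_to_data, SampleRate = 100)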
The resulting output is definitely neater.
1.2.2 Time_StampeR: Annotate your data file with periods of interest
D’Api and De Reinette are satisfied with their new data file. However, some data wrangling remains to be done before they can analyse their data. In particular, they would like to know which data points correspond to the training phase (the initial display of neutral pictures), which ones correspond to fixation crosses, and, most importantly, which ones are the critical periods where participants were watching planes and bikes.
D’Api and De Reinette could use the dplyr package to tidy and organise everything. However, they might not be familiar with data wrangling using the tidyverse and might prefer an easy option specifically designed for their purpose.
The Time_StampeR command was created to answer this need. To use it, you will need to identify the exact time course of the protocol, i.e., when the data should be cut to separate two consecutive periods of interest. You will also need to choose the labels naming these periods of interest.
In the case of D’Api and De Reinette, the periods of interest are the training phase, the eight fixation crosses, and the eight trials. If protocols are complex, identifying the cuts might require a good old pen and a piece of paper (e.g., Figure 3).
Figure 3. D’Api and De Reinette’s protocol.
Once the labels and cuts are identified, we can run the command in a straightforward manner.
Note: there should be x labels and x-1 cuts (here, 17 labels and 16 cuts).
# Identify the time cuts in your protocol:
cuts <- c(20, 22, 30, 32, 40, 42, 50, 52, 60, 62, 70, 72, 80, 82, 90, 92)

# Label the periods:
Label <- c("Training",
           "Fix", "Trial_1", "Fix", "Trial_2", "Fix", "Trial_3", "Fix", "Trial_4",
           "Fix", "Trial_5", "Fix", "Trial_6", "Fix", "Trial_7", "Fix", "Trial_8")
Annotated_Data <- Time_StampeR(df = Data, id_col = "file_name", sample_rate = 100,
                               protocol_duration = 100, cuts = cuts, period_names = Label)
head(Annotated_Data)
## Fx Fy Fz Mx My Mz Time file_name CoP_X
## 1 7.24049 7.22547 790.0556 5.47499 7.68121 -0.16553 0.00 Postural_DataA.txt -0.9722367
## 2 7.46415 7.30759 789.6346 5.41001 7.61477 -0.14419 0.01 Postural_DataA.txt -0.9643410
## 3 7.80475 7.35250 789.3054 5.34504 7.54834 -0.12477 0.02 Postural_DataA.txt -0.9563269
## 4 7.60925 7.46016 789.3280 5.34504 7.54834 -0.16260 0.03 Postural_DataA.txt -0.9562995
## 5 7.49666 6.98660 788.6198 5.34434 7.41616 -0.02562 0.04 Postural_DataA.txt -0.9403974
## 6 7.42719 6.95055 788.2588 5.27937 7.34973 -0.07609 0.05 Postural_DataA.txt -0.9324006
## CoP_Y Period_Name
## 1 0.6929880 Training
## 2 0.6851283 Training
## 3 0.6771827 Training
## 4 0.6771634 Training
## 5 0.6776827 Training
## 6 0.6697508 Training
These annotations should help D’Api and De Reinette explore their data. For example, they can now count the number of data points per period, and then delete all data points that are not critical for their hypotheses.
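The per-period counts shown below were presumably obtained with base R’s table(), e.g.:
# Count the data points per period across all merged files
table(Annotated_Data$Period_Name)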
##
## Fix Training Trial_1 Trial_2 Trial_3 Trial_4 Trial_5 Trial_6 Trial_7
## 9600 12000 4800 4800 4800 4800 4800 4800 4800
## Trial_8
## 4800
Data <- subset(Annotated_Data, Period_Name != "Training" & Period_Name != "Fix")
head(Data)
## Fx Fy Fz Mx My Mz Time file_name
## 2201 8.20202 7.84041 787.7650 10.73050 -2.10720 0.01140 22.00 Postural_DataA.txt
## 2202 8.61100 7.46362 787.4265 10.92603 -2.30759 -0.08077 22.01 Postural_DataA.txt
## 2203 8.23421 8.08361 787.4285 11.18653 -2.44155 -0.19815 22.02 Postural_DataA.txt
## 2204 8.14101 7.59821 788.5537 11.38546 -2.51290 -0.12008 22.03 Postural_DataA.txt
## 2205 8.50985 8.15720 788.1458 11.58099 -2.71330 -0.06111 22.04 Postural_DataA.txt
## 2206 8.06830 8.26632 787.7850 11.90639 -2.78034 -0.01770 22.05 Postural_DataA.txt
## CoP_X CoP_Y Period_Name
## 2201 0.2674909 1.362145 Trial_1
## 2202 0.2930547 1.387562 Trial_1
## 2203 0.3100663 1.420641 Trial_1
## 2204 0.3186720 1.443841 Trial_1
## 2205 0.3442637 1.469397 Trial_1
## 2206 0.3529313 1.511376 Trial_1
Now, their data should only include the critical trials.
This is all very nice, but this is still a lot of data points for them… 8 trials of 8 seconds means 64 seconds of critical measures per session. For 6 participants, that is a total of 384 seconds of observation. With a sample rate of 100 Hz, the poor researchers still have to deal with 38,400 data points (!).
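As a quick check against the per-period counts above (a sketch):
nrow(Data)  # 8 trials x 4800 data points = 38,400 rows left after removing Training and Fix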
Most of the time, researchers are interested in synthesising their data into time bins, also known as epochs. A similar approach has been recommended for postural analysis (see Lelard et al., 2019). Indeed, we are using a tool that measures body sway longitudinally – we might as well make the most of it and use epochs in our analysis (in contrast to using a single average score per participant – what a loss of data points, when you think about it).
D’Api and De Reinette might want to study body sway averaged over each second of the session. To do this, they can make good use of the Epoch_SliceR command of the BalanceMate package.
1.2.3 Epoch_SliceR: Synthesise your data into time bins
The Epoch_SliceR command allows the user to compute average scores at a specified time-bin level. Note that users familiar with the tidyverse could once again do this data wrangling by exploiting that powerful package. Nevertheless, we assume that some researchers might not be familiar with the pipe operators and the specific grammar of the tidyverse. Our package hence aims to deliver a toolbox that covers the data wrangling involved in postural studies.
To use Epoch_SliceR, you will need to determine the columns of your data you would like to synthesise, the sample rate used in your protocol, and the time bin you want to define as your analysis unit.
Here, D’Api and De Reinette would be interested in CoP-Y and CoP-X displacements every second of the protocol.
EpochData <- Epoch_SliceR(df = Data, ID = "file_name", columns_to_synthesize = c("CoP_X", "CoP_Y"), epoch_length = 1, sample_rate = 100, session_duration = 64)
head(EpochData)
## Epoch ID Mean_CoP_X SD_CoP_X Mean_CoP_Y SD_CoP_Y Fx Fy
## 1 1 Postural_DataA.txt 0.7109490 0.21320523 1.743974 0.4338906 8.20202 7.84041
## 2 2 Postural_DataA.txt -0.1622987 0.18771524 1.217026 0.2031887 7.52152 3.20478
## 3 3 Postural_DataA.txt -0.1897843 0.03929402 1.054515 0.1528600 6.94358 5.06172
## 4 4 Postural_DataA.txt -1.0258581 0.54483787 1.634610 0.5449001 9.24985 5.39103
## 5 5 Postural_DataA.txt -1.7313933 0.11739986 1.878156 0.2107398 5.56441 7.42442
## 6 6 Postural_DataA.txt -1.4703300 0.08321871 1.162634 0.2663593 7.46206 5.75981
## Fz Mx My Mz Time Period_Name
## 1 787.7650 10.73050 -2.10720 0.01140 22 Trial_1
## 2 787.1718 7.60428 -2.76518 -0.04273 23 Trial_1
## 3 787.2318 8.68569 1.98998 0.03728 24 Trial_1
## 4 789.6914 6.54076 0.98998 -0.20008 25 Trial_1
## 5 788.4621 14.68819 13.82547 -0.12471 26 Trial_1
## 6 787.8079 13.64925 12.22582 -0.05026 27 Trial_1
Note: this time, the session duration is 64 s, because we removed the training (20 s) and the fixation crosses (2 s x 8), leaving us with only the critical measures occurring during the trials (8 x 8 s).
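From this epoch-level data frame, coarser summaries are one base R call away. For example, a per-trial, per-participant average of the CoP columns (a sketch using aggregate(); column names as in the output above):
# Average CoP per period (trial) and participant, across the 1 s epochs
aggregate(cbind(Mean_CoP_X, Mean_CoP_Y) ~ ID + Period_Name,
          data = EpochData, FUN = mean)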
And voilà, D’Api and De Reinette now have their data formatted for a nice postural analysis. They went from an obscure, headless, 6-column data file to a fully exploitable, synthesised dataset including the time course, time-bin averages, and CoP data.
Yet, postural data can still be quite messy. Indeed, we are studying a signal that might contain artefacts we would like to get rid of. To this aim, the postural literature sometimes reports using filters to obtain a cleaner signal.
In the following chapter, we will hold our breath while diving into the joys of pre-processing signal data.