Applying Machine Learning

In the autumn of 2022, thousands of berry images had been stored along with their metadata, the machine learning model was finalized and the final project seminar was held. The Berry Machine project had ended, and it had been a fascinating journey into practical machine learning.

The Berry Machine project was financed by Interreg Nord, with Natural Resources Institute Finland (LUKE) and the Norwegian Institute of Bioeconomy Research (NIBIO) as participants and FrostBit Software Lab at Lapland University of Applied Sciences as the developer. The project goal was to study whether machine learning could make the current berry harvest estimation process less dependent on manual labour, which would make significantly more field measurements feasible and open up new ways to estimate berry yields.

Currently, berry harvest estimation is done by manually counting every single berry inside a frame made of various materials (Kilpeläinen 2016). The frame encloses exactly one m² of ground. In the forest, five frames are installed in places with varying ground types (marjahavainnot.fi 2022). Even after all of this manual work, the data is still not exhaustive enough for more elaborate studies, as the process is both expensive and laborious (Bohlin 2021).

The Berry Machine project studied how deep learning could help count the berries and thus make the process significantly less laborious. The plan was to use deep-learning computer vision to detect berries and to build a functioning prototype system with a mobile application and all required backend applications. Only bilberry (Vaccinium myrtillus) was required as the initial target species, but lingonberry (Vaccinium vitis-idaea) was also included to give a more comprehensive view of how the Berry Machine system would work across species. Both berries were counted in the growth phases "flower", "raw" and "ripe".

Deep Learning Techniques

The main steps of the Berry Machine project were to train a deep learning model to detect berries in images, use this computer vision to count the berries, and compare the detected berry count with the number of berries counted in field measurements.

Figure 1. Artificial Neural Network.

A traditional Artificial Neural Network (ANN) is formed of input neurons, hidden layers and an output layer (Figure 1). In the case of image classification, the pixel values of the input image are fed as values into the input layer (Gogul 2017). The weight and bias values of the hidden-layer neurons and their connections are learned in the training process. The weights and biases are then used to recalculate new values as the original pixel values propagate from left to right, finally arriving at the output layer, which represents the result in numerical form.
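
To make the propagation concrete, below is a minimal sketch of such a forward pass in NumPy. The layer sizes and the random weights are purely illustrative; in a real network the weights and biases come from training, not from a random generator.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

# Illustrative sizes: 4 input "pixels", one hidden layer of 3 neurons, 2 outputs.
# Real weight and bias values would be learned in training; these are stand-ins.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 4)), np.zeros(3)
W2, b2 = rng.normal(size=(2, 3)), np.zeros(2)

def forward(pixels):
    """Propagate the input values left to right through the network."""
    hidden = relu(W1 @ pixels + b1)   # weights and biases recalculate the values
    return W2 @ hidden + b2           # output layer: the result in numerical form

print(forward(np.array([0.1, 0.5, 0.9, 0.2])))
```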

Figure 2. Convolutional Neural Network.

Compared to the ANN, image classification results improve significantly when changing to the breakthrough technology, the Convolutional Neural Network (CNN) (Marmanis 2016). The CNN architecture improves on the ANN architecture by adding convolution and pooling layers. The new layers simplify the image and extract its features while preserving each feature's position in the original image (Figure 2). Pooling layers sub-sample the image, and the feature extraction layers detect increasingly high-order features. With the CNN architecture, computer vision is capable of human-level or higher accuracy (Galanty 2021).
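
As an illustration of the convolution and pooling layers described above, here is a minimal CNN classifier sketched in PyTorch. The layer sizes, input resolution and six-class output are illustrative assumptions, not the project's actual model.

```python
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    """Minimal CNN: convolutions extract features, pooling sub-samples,
    and a fully connected layer turns the features into class scores."""
    def __init__(self, num_classes=6):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),   # sub-sample: 64x64 -> 32x32
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),   # sub-sample: 32x32 -> 16x16
        )
        self.classifier = nn.Linear(32 * 16 * 16, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

model = TinyCNN()
logits = model(torch.randn(1, 3, 64, 64))  # one 64x64 RGB image
print(logits.shape)  # torch.Size([1, 6])
```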

CNN architecture on its own only classifies an image as a whole. The project goal with computer vision was to detect multiple instances of berries in various growth phases in a single image. Therefore, an object detection architecture is required instead of an image classification architecture, as object detection also locates the berry instances against their natural background (Liu 2018).

Figure 3. YOLO (You Only Look Once) object detection.

The object detection architecture used in the project was YOLO (You Only Look Once). YOLO was designed to be fast, simple and accurate (Redmon 2015). The YOLO process splits the image into a grid, detects objects in each grid cell and predicts their bounding boxes. The bounding box results are combined with each cell's classification probability to filter more accurate results (Redmon 2015). The YOLO process is visualized in Figure 3. YOLO version 5 was used in the project, as it is built on the PyTorch framework (Solawetz 2020), which made it easier to implement. As a bonus, YOLOv5 applies data augmentation during training out of the box (Benjumea 2021).
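
As a rough illustration, running inference with a YOLOv5 model can be done through the torch.hub interface the YOLOv5 repository provides. The weight and image file names below are hypothetical stand-ins for the project's trained model and input photos.

```python
import torch

# Load a custom-trained YOLOv5 model; 'berry_weights.pt' is a hypothetical
# path to weights such as the ones trained in the project.
model = torch.hub.load('ultralytics/yolov5', 'custom', path='berry_weights.pt')

results = model('berry_photo.jpg')   # hypothetical input image
df = results.pandas().xyxy[0]        # bounding boxes, confidences, class names
print(df[['xmin', 'ymin', 'xmax', 'ymax', 'confidence', 'name']])
```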

From Data to Prototype

A deep-learning training dataset was formed by taking thousands of berry images. Approximately 10,000 images were collected, mostly with consumer-level mobile phones. One of the ideas in the original project plan was to move toward crowdsourcing the berry harvest measurements with an easy-to-use mobile application. Therefore the training data was collected mostly with phones available to the general public.

Figure 4. Examples of the six annotated classes in the dataset.

During and after the dataset collection, the annotation process was carried out. Annotation was very labour-intensive, and not all images were annotated due to limited resources. Annotation simply means manually drawing the correct berry class on top of the image. A significant difficulty was that berries are small, and raw berries in particular hide well in the background. After the main dataset of annotated images was complete, additional annotation rounds were run to balance the dataset. A balanced dataset is a training dataset with a roughly equal number of instances of each class. The final dataset was formed of the six classes shown with examples in Figure 4.

Figure 5. Annotated counting frame.

Density is the number of instances divided by area. Computer vision was used to estimate the number of berries, but the surface area was still missing. For density estimation, the field measurement counting frames were also annotated, as in Figure 5. The frame polygon was used to crop the deep-learning-detected berries, so that all berries left in the image were inside a one m² or 0.25 m² area. Two frame sizes were used to allow comparing detection correlation differences between images taken closer to and further from the ground.
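
A sketch of how cropping detections to the frame polygon might look, here using the shapely library. The polygon coordinates, the box format and the detections are hypothetical; the article does not show the project's actual cropping code, only its result.

```python
from shapely.geometry import Point, Polygon

# Hypothetical frame annotation: polygon corners in image pixel coordinates.
frame = Polygon([(120, 80), (950, 95), (940, 900), (110, 880)])

def inside_frame(detections, frame):
    """Keep only detections whose center point falls inside the counting frame."""
    kept = []
    for x_min, y_min, x_max, y_max in detections:
        center = Point((x_min + x_max) / 2, (y_min + y_max) / 2)
        if frame.contains(center):
            kept.append((x_min, y_min, x_max, y_max))
    return kept

detections = [(200, 150, 230, 180), (20, 30, 50, 60)]  # illustrative boxes
print(inside_frame(detections, frame))  # only the first box lies inside
```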

Figure 6. Berry Machine tasks.

The Berry Machine tasks are visualized in Figure 6. A significant amount of work went into analysing and iteratively improving the training results. Improving the training results meant more annotation, as the data was sometimes not balanced enough. The computer was not the only one with difficulties in detecting berries: annotating was also difficult due to the large number of obscure berries barely visible in the images. It is likely that if different personnel were directed to annotate the same images, the results would vary.

The main dataset processing step before training was splitting each image into smaller pieces. The YOLOv5 model used in the project had a native resolution of 640×640 pixels, so inputting a larger image meant downscaling it to the native resolution. As images were taken at, for example, 4032×3024 resolution, significant loss of detail would occur, which is detrimental when detecting small instances like berries. Therefore all images were annotated in their original resolution, and every image in which berries were to be detected was split into smaller pieces before detection and training. A complex splitting algorithm with overlapping zoom levels was used initially, but the process was reverted to a simpler split, as the simpler process had more advantages than the complex one. Once again, simplicity wins.
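
A minimal sketch of the simple grid split described above, using Pillow. The file name is hypothetical, and the project's real splitter also had to split the annotation data to match each piece.

```python
from PIL import Image

def split_image(path, tile=640):
    """Split a high-resolution image into tile x tile pieces (simple grid)."""
    img = Image.open(path)
    width, height = img.size
    pieces = []
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            box = (left, top, min(left + tile, width), min(top + tile, height))
            pieces.append(img.crop(box))
    return pieces

# e.g. a 4032x3024 photo yields a 7x5 grid of pieces (edge tiles are smaller)
pieces = split_image('berry_photo.jpg')
```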

Training iterations were not exhaustively laborious by themselves, but with deep learning, improvements were made by modifying the dataset and its pre-processing, which was time-consuming. Project time was also spent on post-processing, as training results required analysis and, in some cases, detection quality checks against a benchmark dataset.

Figure 7. Number of instances per class in the training dataset.

The final training dataset of image pieces normalized to 640×640 pixels contained 41,101 images with 95,065 annotated berries. The dataset was supplemented with synthetic data created by randomly adding clipped berry images with transparent backgrounds on top of various background images. The background images were mostly photos taken around the campus and plain white backgrounds. Figure 7 visualizes the final balance of the dataset; the balance was deemed sufficient based on the graph.
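
A sketch of how such synthetic compositing can be done with Pillow. All file paths are hypothetical, the berry clips are assumed to be smaller than the background, and a real generator would also have to emit the matching bounding-box annotation for each pasted berry.

```python
import random
from PIL import Image

def make_synthetic(background_path, berry_paths, n=5):
    """Paste clipped berry images (with transparent backgrounds) at random
    positions on a background image."""
    bg = Image.open(background_path).convert('RGBA')
    for _ in range(n):
        berry = Image.open(random.choice(berry_paths)).convert('RGBA')
        x = random.randint(0, bg.width - berry.width)    # assumes clip fits
        y = random.randint(0, bg.height - berry.height)
        bg.alpha_composite(berry, dest=(x, y))  # alpha keeps the berry outline
    return bg.convert('RGB')

synthetic = make_synthetic('campus_background.jpg', ['ripe_bilberry_clip.png'])
```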

Figure 8. Prototype production backend architecture.

As presented in Figure 8, the project's backend architecture was split into four separate services. The prototype backend was the main backend for the prototype mobile application; its main tasks were to provide login functionality, act as storage for uploaded images and store location and result values. The dataset inference service's purpose was to batch-run the detection process over thousands of images and save the results for data analysis; the resulting metadata of detected berries was used for the density estimation study. The pipeline backend's main task was to split each image and request detection from the detection backend for every split piece. After all the image slices had been run through the detection backend, the detected berries' positions in the slices were converted back to the coordinates of the original image, and all results were sent to the address defined in the original request. The final service was the inference backend, which ran the model on the received image and responded with the detection results.
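
The position conversion done by the pipeline backend reduces to adding each slice's pixel offset back to the detected box coordinates. A minimal sketch, assuming boxes are (x_min, y_min, x_max, y_max) tuples in pixels:

```python
def to_original_coords(box, tile_left, tile_top):
    """Convert a detection box from slice-local pixel coordinates back to the
    coordinates of the original, un-split image."""
    x_min, y_min, x_max, y_max = box
    return (x_min + tile_left, y_min + tile_top,
            x_max + tile_left, y_max + tile_top)

# A berry detected at (12, 40)-(38, 66) in the slice starting at (1280, 640)
print(to_original_coords((12, 40, 38, 66), 1280, 640))  # (1292, 680, 1318, 706)
```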

Prototype application development started after the first viable detection model was trained. The main architectural decision with the prototype backend was to build the actual deep learning backend (the inference backend) as a separate service. This decision allowed the backend to be deployed separately on Graphics Processing Unit (GPU) or non-GPU servers. In addition, the image-split backend was developed as a separate service so that the standard prototype backend used by the prototype application needed only minimal modification. This made it easy to deploy the prototype backend using an existing headless API and allowed the image-split-enabled detection backend (the pipeline backend) to be used directly, bypassing the prototype backend presented in Figure 8. A modular design separates concerns more reliably, but in practice it was noted that deploying multiple services on each edit slowed the process significantly and thus increased complexity.

Figure 9. Prototype User Interface.

The prototype application was developed using the multiplatform Flutter framework, which enables developing for iOS, web and Android from the same codebase (GitHub.com 2022). At a measurement location, the user created a new location in the UI. The location user interface is visualized in Figure 9. The location page showed the estimated berry density among other information. The target berry species and growth phase were defined by the most prominent berry detected across multiple images. Density was estimated as the average over all images and corrected with a linear regression formula derived from the correlation analysis. Surface area estimation for the density calculation was tested with two methods: first, estimating the area from the average berry size, and second, estimating the area using augmented reality (AR) libraries. AR was implemented only on Apple iOS devices.

Results

The project produced two kinds of results: deep learning detection model results and density estimation results. The detection results, which come from the last training iteration, are covered first.

Figure 10. YOLOv5 confusion matrix.

In the YOLOv5 deep learning process, the dataset of annotated image pieces was split into three parts: training, test and validation. The training dataset was used to train the model, the test dataset was used to follow the model's progress during training, and the validation dataset was used for final testing after training was done. Validation results indicated a detection rate from 83% to 93% depending on the class. The results are visualized with the confusion matrix in Figure 10, which depicts the percentages of predicted detections matching the true values. Reading the confusion matrix for the ripe bilberry class: 1% of detections were raw bilberry, 91% were correct, and 17% of the detected ripe bilberries were actually background. According to the confusion matrix, ripe lingonberry was the best-detected class, while raw bilberry had clearly the worst results. Overall, the results were deemed acceptable.

Figure 11. Detected berries cropped with frame bounds.

For the density estimation study, the trained model was run to detect berries in nearly 5,000 images with one m² and 0.25 m² frame annotations. The frame annotations were then used to crop the detected berries so that only berries inside the frame were kept (Figure 11).

estimated density = (number of detected berries of the correct type) / (frame surface area)

Equation 1. Calculating the estimated berry density.

For the density calculation, the surface area was defined by the frame area, and the number of berries was the frame-cropped count of deep-learning-detected berries of the same type as counted in the image. Each image's file path encoded the target berry species, growth phase and location. This meta-information was used to filter only the target classes into the density estimation. Density was calculated for each image with Equation 1.
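
Equation 1 written out as a small Python function; the class name and frame area below are illustrative assumptions.

```python
def estimated_density(detections, target_class, frame_area_m2):
    """Equation 1: detected berries of the correct type divided by frame area.
    `detections` is a list of predicted class names for one cropped image."""
    count = sum(1 for name in detections if name == target_class)
    return count / frame_area_m2

# e.g. 23 ripe bilberries detected inside a 0.25 m^2 frame -> 92 berries per m^2
print(estimated_density(['bilberry_ripe'] * 23, 'bilberry_ripe', 0.25))
```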

The estimated density was not the final result used in the prototype. The assumption was that an image will not show all berries present, but that the berry detections in the image correlate with the field measurements. A linear regression formula was derived using this correlation; in the prototype, the regression formula was different for each class.
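
A sketch of deriving such a per-class regression formula with SciPy. The numbers are invented for illustration and do not come from the project data.

```python
from scipy.stats import linregress

# Illustrative data: berries detected in images vs. berries counted in the field
detected = [4, 9, 11, 15, 22, 30]
field_count = [12, 25, 28, 41, 60, 85]

fit = linregress(detected, field_count)
print(f'Pearson r = {fit.rvalue:.2f}')

def corrected_estimate(n_detected):
    """Per-class regression formula used to correct the raw detected count."""
    return fit.slope * n_detected + fit.intercept
```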

Figure 12. Scatter plots of detected berries in cropped images compared to field measurement values. Colouring by location.

The estimated density and the field measurement value were plotted in a scatter plot for each class separately (Figure 12). The Pearson correlation for each class is shown in the plot title. The scatter plots are coloured by location to visualize how location affects the relation between the detected berries and the field berry count. The scatter plots show several points:

  • Ripe lingonberry has less data, and all of the ripe lingonberry images are from 2021 only. The homogeneity of the ripe lingonberry data hints at the need for more varied data, as homogeneous datasets can lead to premature conclusions.
  • To practically use the correlation, for example with linear regression, to estimate density from an image, more variables need to be taken into consideration. The location colouring hints at one way to improve the correlation: taking the growth location into account.
  • Bilberry and lingonberry behaved differently. With bilberry, one possible cause of noise is undergrowth lusher than lingonberry's, blocking the camera's view of the berries.
  • Lingonberry was detected as bunches, but the field measurement counted lingonberries as individual berries. The difference between berry count and bunch count is likely to increase with increased yield, as lingonberry may grow more berries in a single bunch when the harvest improves.

Figure 13. Scatter plots of detected berries in cropped images. Only the image with the maximum berry count is selected when several images are taken of the same frame. Colouring by location.

One method of improving the correlation was to take all the images of the same frame and use only the image with the most berries detected. In theory, only the image with the best angle and quality for capturing the most berries is then used. This required taking multiple images of a single frame. Practical challenges with image quality were the sun's position relative to the camera making individual berries hard to detect, camera shake, and shooting angles with more ground vegetation in the way. Using the image with the most berries was justifiable, and the maximum-value image did give a better correlation with the field measurement value than the average (Figure 13).
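
Selecting the maximum-count image per frame is a one-liner with pandas; the frame IDs and counts below are invented for illustration.

```python
import pandas as pd

# Illustrative detection results: several images taken of the same frame
df = pd.DataFrame({
    'frame_id': ['A', 'A', 'A', 'B', 'B'],
    'detected': [14, 19, 11, 7, 9],
})

# Keep only the image with the most berries detected for each frame
best = df.loc[df.groupby('frame_id')['detected'].idxmax()]
print(best)  # frame A -> 19 detections, frame B -> 9 detections
```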

Conclusions

Detecting berries in different growth phases using consumer-level cameras and deep-learning object detection yielded successful results. Computer vision with object detection is a highly viable method for detecting difficult instances against a natural background. Invaluable information about practical machine learning processes, especially concerning data pre- and post-processing, was collected. The project also created a large and valuable training dataset, from the field measurements to the additional synthetic data.

Figure 14. Lush bilberry vegetation hides most of the berries.

Density estimation was not simple, as many things in nature are not. A perfect example of the challenges is shown in Figure 14 with bilberry: a high-yield bilberry harvest is accompanied by lush ground vegetation, which in turn lowers the number of berries detected in the image.

The project gave a more in-depth view of ways to estimate density and how to approach the problem. The main challenges in density estimation were berry detection, surface area estimation and estimating the density from the information collected. Berry detection was deemed feasible, but the other two require more study.

Surface area estimation was tested in the prototype with two methods: using detected berries as a base measurement to estimate the area covered by the image, and using the operating system's augmented reality (AR) framework. AR was deemed the preferable method in the long term. The prototype proved that AR can be used to measure ground area, but a significant amount of software development is required to make it viable. Challenges include detecting a flat plane on a practically uneven surface, handling the noise in AR measurements caused by the constantly moving camera, and implementing AR on both of the most common phone operating systems.

Concerning density estimation from the surface area and berry detection results, more research is required. A correlation does exist, but it is too weak to be practical. Directing users to take multiple images and using the image with the most berries detected is one viable way to improve the correlation, but more data dimensions are needed. The berry's growth location shows promise as one dimension to factor in.

Figure 15. Only a small fraction of real-world ML systems are composed of the ML code, as shown by the small black box in the middle. The required surrounding infrastructure is vast and complex (Sculley 2015).

The deep-learning process produced a considerable amount of experience. Practical experience in the project matched the argument in the article "Hidden Technical Debt in Machine Learning Systems" that Machine Learning (ML) systems have a special capacity for incurring technical debt (Sculley 2015). The article also states that only a tiny fraction of the code in an ML system is for the actual training or prediction (Figure 15). The Berry Machine project codebase supported this statement: a large amount of code was required to modify and create the dataset and to post-process results. ML backend deployment with modular design, version verification and generic request and response formats also increased the codebase significantly. A final, often overlooked part was the analysis tooling needed to draw conclusions and steer the iterative training process in the right direction.

As Sculley (2015) states, using generic packages in the hope of reducing the amount of coding required often results in glue code to join the generic packages together. The glue code issue led the project to develop custom packages for its purposes. Version control was another difficult aspect, as the ML system was improved by modifying the large training dataset, which was not managed by a version control system. It is therefore difficult to reproduce the results of a specific version without the corresponding training dataset.

Implementing a real-world ML system can be a hassle. A large amount of surrounding infrastructure is required. Infrastructure is not inherently negative, but a valuable lesson is to consider this requirement when creating an ML system. When implementing an ML system, the obstacles differ from those of a normal software project. Before and during the process, developers need to be careful not to fill the system with anti-patterns like glue code, pipeline jungles and abstraction debt (Sculley 2015).

In the end, the Berry Machine project was a priceless experience filled with hope and excitement. The project generated numerous internships and launched two bachelor's theses and one master's thesis. Practical ML system experience was gained. Hopefully we will see more ML usage in the future, as the potential is tangible.

This article is partially based on the author's master's thesis "Berry Density Estimation With Deep Learning: Estimating Density of Bilberry and Lingonberry Harvest with Object Detection", available at https://urn.fi/URN:NBN:fi:amk-2022120927665


References

Benjumea, Aduen, Izzedin Teeti, Fabio Cuzzolin and Andrew Bradley. "YOLO-Z: Improving small object detection in YOLOv5 for autonomous vehicles." arXiv preprint arXiv:2112.11798, 2021.

Bohlin, Inka, Matti Maltamo, Henrik Hedenås, Tomas Lämås, Jonas Dahlgren and Lauri Mehtätalo. "Predicting bilberry and cowberry yields using airborne laser scanning and other auxiliary data combined with National Forest Inventory field plot data." Forest Ecology and Management, Volume 502. ISSN 0378-1127, 2021.

Galanty, Agnieszka, Tomasz Danel, Michał Węgrzyn and Irma Podolak. "Deep convolutional neural network for preliminary in-field classification of lichen species." 2021.

GitHub.com. Flutter SDK. 2022. https://github.com/flutter/flutter (retrieved 7. 12 2022).

Gogul, I. and Kumar, Sathiesh. ”Flower species recognition system using convolution neural networks and transfer learning.” 2017.

Kilpeläinen, Harri, Jari Miina, Ron Store, Kauko Salo and Mikko Kurttila. "Evaluation of bilberry and cowberry yield models by comparing model predictions with field measurements from North Karelia, Finland." Forest Ecology and Management, Volume 363. ISSN 0378-1127, 2016. 120-129.

Liu, Li, Wanli Ouyang, Xiaogang Wang, Paul Fieguth, Jie Chen, Xinwang Liu and Matti Pietikäinen. "Deep Learning for Generic Object Detection: A Survey." International Journal of Computer Vision, 2018.

marjahavainnot.fi. ”Luonnonmarjojen satohavainnot.” 2022. https://marjahavainnot.fi/assets/info/Havaintometsan_perustaminen_v2.pdf (retrieved 25. 4 2022).

Marmanis, Dimitris, Jermaine Wegner, Silvano Galliani, Konrad Schindler, Mihai Datcu and Uwe Stilla. "Semantic Segmentation of Aerial Images with an Ensemble of CNNs." ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences, 2016. 473-480.

Redmon, Joseph, Santosh Kumar Divvala, Ross B. Girshick and Ali Farhadi. "You Only Look Once: Unified, Real-Time Object Detection." CoRR abs/1506.02640, 2015.

Sculley, D. and Holt, Gary and Golovin, Daniel and Davydov, Eugene and Phillips, Todd and Ebner, Dietmar and Chaudhary, Vinay and Young, Michael and Crespo, Jean-François and Dennison, Dan. ”Hidden Technical Debt in Machine Learning Systems.” Advances in Neural Information Processing Systems. 2015.

Solawetz, Jacob. YOLOv5 New Version – Improvements And Evaluation. 29. 6 2020. https://blog.roboflow.com/yolov5-improvements-and-evaluation/ (retrieved 25. 4 2022).

This article has been evaluated by FrostBit's publishing committee, which includes Heikki Konttaniemi, Toni Westerlund, Jarkko Piippo, Tuomas Valtanen, Pertti Rauhala and Tuuli Nivala. The article publications can be found on the FrostBit publication blog.

17.12.2022


Mikko Pajula, Specialist

Mikko is a coder and part-time teacher. His programming tasks are mainly full stack development, with extra knowledge in GIS and deep learning.

Bachelor of Engineering

Detecting berries with Deep Learning

BerryMachine is a project funded by Interreg Nord, the partners being Lapland University of Applied Sciences, Natural Resources Institute Finland (LUKE) and NIBIO (Norway). The purpose of the project is to research the possibilities of artificial intelligence in detecting the amount and quality of the current berry harvest, based on a photo taken with a smartphone camera. The artificial intelligence system prototype will be developed by FrostBit Software Lab.

Traditionally, berry harvest sighting data is produced manually by counting berries within one square meter of forest area at a time. These sightings are later used in calculations to determine the overall berry harvest situation. This method takes an excessive amount of time, especially during the berry counting phase. Therefore, using artificial intelligence to speed up berry counting is a potential resource saving, which would also enable non-experts to count berries easily and improve overall efficiency.

The phases of implementing the artificial intelligence system during the project include planning and executing the acquisition of the berry photo dataset, preprocessing the dataset, and studying, testing and implementing suitable image recognition and object detection technologies into one complete system based on the project dataset. Creating an artificial intelligence system is a naturally iterative process: earlier phases have to be revisited if the project requires it, excluding the acquisition of the project's photo dataset.

Preliminary technological study, Case BerryMachine

Modern artificial intelligence is an extremely wide concept, which can be divided into three main categories from a software architecture point of view: traditional machine learning, deep learning and reinforcement learning (Krishnan 2019). Since the BerryMachine project focuses on image and object detection, the most natural approach is to use deep learning technologies in the project.

Figure 1. A neural network, illustration.

With deep learning, the artificial intelligence independently learns the important features a photo consists of. These features can be, for example, the shape or color of the recognized object, or even the typical background the recognized object is usually attached to (for example, a car tire is typically attached to a car). The core concept of image and object detection is the neural network, which is a complex, multi-layer information processing method broadly similar to the way humans learn new things. A single layer of a neural network essentially analyzes one feature of a photo at a time: one layer could focus on the object's shape, while another analyzes the object's color (Bonner 2019; Patel 2020; Géron 2019, 448). There are multiple types of neural networks (CNN, ANN, RNN). In this article, we always refer to the CNN type, also known as the convolutional neural network (Pai 2020).

The theory of neural networks goes back as far as 1943 (McCulloch & Pitts 1943), but only in the last couple of decades has processing power become sufficient for applying neural networks in practical applications in the field of information technology.

When it comes to image recognition, two main alternatives exist: image classification and object detection (Brownlee 2021; Sharma 2019). Multiple approaches exist for both alternatives, and none of them is naturally best for all purposes, since it always depends on the situation. The most suitable approach and alternative for a given purpose and dataset are typically found by boldly experimenting with different alternatives (Leo 2020; Dwivedi 2020). This is also why developing image recognition systems takes a considerable amount of development time. Because of this, any artificial intelligence project should have a flexible project structure to ensure the best possible outcome with the given resources.

Image classification and object detection are often used for similar purposes, but they have different outcomes and implementations. From the outcome point of view, image classification aims to recognize what the photo depicts as a whole, while object detection aims to find every instance of the desired objects and their locations within the photo. In other words, image classification recognizes that a photo contains flowers, while object detection also points out where all the recognized flowers are located within the photo.

Figure 2. The difference between image classification and object detection.

Common image classification technologies, neural network models and algorithms include VGG16/19, ResNet, MobileNet, Inception and Xception, just to name a few; other viable alternatives also exist. Common object detection technologies include EfficientDet, YOLO and R-CNN. Which machine learning technology should be used in a given situation often depends on practicality related to software architecture, the compromise between detection accuracy and efficiency, and the suitability of the technology for the project's needs.

The practicality related to software architecture concerns compatibilities between technologies that only function correctly on certain software platforms and versions, which can make the software development process extremely slow and unnecessarily difficult if compatibility issues are too frequent. The compromise between detection accuracy and efficiency refers to selecting a recognition technology that has the best possible detection accuracy but is also light enough to be realistically used in the project's systems. For example, the most accurate and heaviest technologies might not be viable on mobile devices if the calculations required by the artificial intelligence are done on the device itself. In the BerryMachine project, all heavy computations are calculated on servers, which allows us to use heavier technologies if needed. The suitability for the project's needs refers to how well the chosen technology actually performs with the project's dataset. For example, a certain pre-trained image recognition model (VGG19, ResNet etc.) could perform better at recognizing vehicles in photos than some other model. (Github 2021; Hofesmann 2021; Lendave 2021; Özgenel & Sorguç 2018.)

Producing the berry photo dataset and preprocessing

The technological development of the BerryMachine project started in spring 2021 with the design of the berry photo dataset acquisition phase. When it comes to image recognition, the photo dataset has to be considerably large. One classic rule of thumb states that each recognizable category should have at least 1,000 different photos in the dataset; others imply that fewer can suffice in some situations. The photo dataset should also be versatile, so that the artificial intelligence learns to detect the desired objects against different backgrounds as well (Warden 2017; Huellmann 2021). We decided to acquire a photo dataset as large and versatile as possible, so that we would surely have enough material for the project's needs.

Figure 3. Some photos within the BerryMachine berry photo dataset.


More than 10,000 berry photos were collected during the summer and autumn of 2021 in the BerryMachine project. A single observation square was an aluminium frame of one square meter, within which the berries were counted. To process the photo dataset into a format understood by the artificial intelligence, the dataset first had to be annotated. Annotation is the process of marking the recognizable objects in each photo in such a way that the artificial intelligence can train itself to recognize the desired objects in new photos. In this case, the desired objects are raw and ripe berries as well as their flowers. The berry types found in the dataset are bilberry and lingonberry.

A software solution called Label Studio was used to annotate the berry photo dataset. Label Studio supports a multi-user annotation process on the same dataset simultaneously and provides various annotation data formats for artificial intelligence training. Label Studio also supports using a previously self-trained machine vision model to help annotate photos automatically: the artificial intelligence finds the potential berries in the photo, and the user verifies and fixes the results if needed.

Figure 4. The annotation view of Label Studio and accepting the results of automated annotations.

If using traditional image classification, annotation is not needed, because the photo dataset can be processed by the artificial intelligence based on separate folders, where each folder represents a single recognizable category. Since we focused on object detection in the first version of the BerryMachine artificial intelligence system, we concentrated on annotating the photo dataset in this phase.


When a single photo is fed to the neural network, the photo has to be resized into a certain size and shape. The typical image size in image classification is 224×224, which keeps the photo processing time from growing too much with image size. Since object detection relies mainly on annotation data, a greater image size can be used by default. For example, the first version of the BerryMachine artificial intelligence system uses the YOLOv5l object detection model, which supports photos of size 640×640 or 1280×1280. The photos in the BerryMachine photo dataset are typically larger than this, for example 3024×4032, which causes certain challenges in development, since the sizes do not match the supported image sizes. Using a larger image size also makes the training phase of the artificial intelligence considerably heavier.

Finally, the processed photo is converted into tensor format (a multidimensional matrix), which means the photo is converted into a numeric format. Since neural networks do not understand photos as such, the numeric format is necessary for the neural network to be able to process the data. The numeric data within a tensor is often also normalized, so that each value lies between -1 and 1, which is optimal for training the neural network.
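
A sketch of this preprocessing with torchvision: normalizing with a per-channel mean and standard deviation of 0.5 maps pixel values from [0, 1] to [-1, 1] as described above. The file name is hypothetical.

```python
import torch
from PIL import Image
from torchvision import transforms

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),      # typical classification input size
    transforms.ToTensor(),              # PIL image -> CHW float tensor in [0, 1]
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),  # -> [-1, 1]
])

tensor = preprocess(Image.open('berry_photo.jpg').convert('RGB'))
print(tensor.shape, tensor.min().item(), tensor.max().item())
```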

First technological prototype

After the photo dataset was in a usable format, the next phase was to feed it to the artificial intelligence. This phase is called training, and it results in an artificial intelligence model that is able to detect the objects in a photo it has been trained to detect. In the case of BerryMachine, the artificial intelligence is trained to recognize raw and ripe berries as well as their flowers in a photo.

In order to include even the smallest photo details in the training phase, cropping the photo dataset was necessary. There were no ready-made automatic tools for this cropping phase, which forced us to create our own. During the cropping phase, some challenges emerged, for example berries that lay in the middle of the cropping lines, or close-up shots of berries appearing in multiple overlapping cropped pictures. While the photo dataset was cropped automatically, the annotation data was split simultaneously in order to keep the dataset synchronized.

Figure 5. Some usable image dataset cropping techniques.


After the photo dataset was cropped and split and the training software of the artificial intelligence was developed, the training software and the dataset were transferred to the Mahti supercomputer at CSC, which was opened to higher education institutions in Finland in 2020 (see CSC 2020 and Docs CSC 2021).

Training an artificial intelligence often takes a considerable amount of time: the more complex the neural network and the larger the dataset, the longer the training time and the more processing power the training phase requires. This is why graphics processing units (GPU) are usually preferred over traditional processors (CPU) when training a neural network. The reason is the processor architecture of a GPU, which is more efficient at processing specialized calculations in parallel compared to CPUs; this is especially useful when training neural networks (Sharabok 2020).
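
In PyTorch, moving training onto a GPU when one is available is a short idiom; the tiny model below is only a placeholder for illustration.

```python
import torch
import torch.nn as nn

# Training runs on the GPU when available, otherwise falls back to the CPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 6)).to(device)
batch = torch.randn(8, 3, 64, 64, device=device)  # data must be on the same device
logits = model(batch)
print(logits.shape, logits.device)
```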

For the first version of the BerryMachine's artificial intelligence system, the YOLOv5l object detection technology was chosen as the main technology, mostly due to its practicality and detection accuracy. YOLOv5l is programmed using the popular PyTorch module (Python), which makes the implementation phase far easier. YOLOv5l is also very competitive in its detection accuracy, even when compared to other common object detection technologies.

During the training phase of the artificial intelligence, the amount of available graphics card memory considerably affects training efficiency. The Mahti supercomputer at CSC has 40 GB of graphics memory in total. While training the BerryMachine artificial intelligence, even this amount of memory proved insufficient: when we tried to use the larger image size of 1280×1280, the memory of a single graphics card ran out. One way to solve this challenge was to revert to the image size of 640×640; however, this requires a longer training phase, since the photo dataset has to be cropped into smaller pieces in order to achieve the same accuracy.

In the end, it took approximately 30 hours to train the latest version of the artificial intelligence on the Mahti supercomputer. After the training phase, the next step was to evaluate the training process and the detection results.

Evaluating the first results and planning for the future

The first version of the BerryMachine artificial intelligence system aimed to recognize the following categories from a given photo:

  • bilberry (flower, raw, ripe)
  • lingonberry (flower, raw, ripe)
  • lingonberry bunch (flower, raw, ripe)

We decided to create a separate category for the lingonberry bunch because, based on our own experience, the artificial intelligence seems more adept at recognizing a bunch than all the lingonberries within a bunch separately. This most likely happens due to the way lingonberries grow in nature compared, for example, to bilberries.

Figure 6. Lingonberries and berry detections.

The trained artificial intelligence system produced the following detection results with different photos:

Figure 7. The trained artificial intelligence system recognizes berries in a photo. The decimal number after the recognized category is the detection confidence (e.g. 0.7 = 70% confidence of a correct detection).

An image classification and/or object detection model can be evaluated with a variety of tools. One of the most common is the so-called confusion matrix, which shows at a glance how well the trained artificial intelligence model recognizes objects correctly and incorrectly. The confusion matrix created by the first version of the BerryMachine artificial intelligence is the following:

Figure 8. The confusion matrix produced by the first version of the BerryMachine artificial intelligence system. The numbers on the diagonal are the correct detections within the test dataset; everything else is a false detection. The values are percentages in decimal format (e.g. 0.72 = 72%).

The challenges with lingonberry detection can be seen in the figure above. The background of the photos is also problematic, especially when detecting lingonberries. Even though all of the accuracies could be greatly improved, it is especially the lingonberry that causes most of the difficulties at this time.

The next questions therefore are: why does this happen, and what can we do to improve the artificial intelligence further? The first step is to examine the dataset itself:

Figure 9. The distribution of berry photos by berry type in the dataset.

Figure 10. The distribution of berry photos by berry type and ripeness level in the dataset.

Figure 11. The dataset divided by ripeness levels and growing phases.

Figure 12. The dataset divided by growing phases and berry types, including lingonberry bunches.

By examining the figures above, we can see the imbalances within the photo dataset. At first it seems the number of bilberry photos is too small, but when examining the number of annotations in particular, the greatest deficit is in the lingonberry photos. One approach to improving the dataset would be to obtain more material for this category and then retrain the artificial intelligence.

We can also notice the following:

  • Lingonberries have more annotations than bilberries as a whole
  • The number of annotations per ripeness level is balanced across all photos
  • The number of annotations is balanced between growing phases within a single berry type
  • The number of annotations is considerably lower for raw lingonberries and lingonberry bunches than for other categories. This correlates with the other difficulties in detecting lingonberries (see the earlier confusion matrix), but most likely does not explain all of them

In addition to complementing the dataset, we can also consider other methods to improve the artificial intelligence system. These include:

  • Replacing the currently used object detection technology completely
  • Using a heavier object detection model, for example, YOLOv5x
  • Creating additional artificial intelligence tools to further process lingonberry detections in cases where the artificial intelligence is not confident enough of a correct detection
  • Using image classification technology to support the object detection technology

Generally, it seems we are not going to achieve a satisfying detection accuracy by relying solely on object detection technologies. Because of this, FrostBit is next going to combine object detection with conventional image classification to further process problematic detection cases.

Even if we can't provide exact numbers, we can give an educated guess on typical detection accuracies with different image detection technologies. Based on our own experience, the typical detection accuracy in object detection is between 40-70%, while the typical detection accuracy in image classification is between 80-95% per single photo. The limitation of image classification is that it only analyzes the photo as a whole, for example whether the photo contains a raw lingonberry or a ripe bilberry.

Since image classification can only detect what the photo depicts as a whole, it is not effective at recognizing multiple berries in a single photo at once. Because of this, the next version of the BerryMachine project's artificial intelligence system will use object detection only to find berries at a general level within a photo; each detection is then cropped into a separate smaller photo, which is finally processed by image classification technologies. The general software architecture can be visualized in the following way:

Figure 13. The next version of the BerryMachine artificial intelligence system, software architecture design.

Creating any kind of artificial intelligence system can take a potentially infinite amount of development time, the only limitations being creativity and available processing power. Because of this, FrostBit aims to get as much as possible out of the artificial intelligence technologies within the given resources during the project. After the BerryMachine project, it will be interesting to see how far we got with the developed berry detection system and what we can create next based on our previous findings. Every development iteration seems to take us closer to our original objective.

The BerryMachine project is funded by Interreg Nord 2014-2020. The total budget of the project is 144,431 euros, of which 122,766 euros is funded by the EU. The project schedule is 1.1.2021 – 30.9.2022.

This article has been evaluated by FrostBit's publishing committee, which includes Heikki Konttaniemi, Toni Westerlund, Jarkko Piippo, Tuomas Valtanen, Pertti Rauhala and Tuuli Nivala. The article publications can be found on the FrostBit publication blog.

References

Bonner, A. 2019. The Complete Beginner’s Guide to Deep Learning: Convolutional Neural Networks and Image Classification. Towards Data Science 2.2.2019. Accessed on 17.12.2021 https://towardsdatascience.com/wtf-is-image-classification-8e78a8235acb

Brownlee, J. 2021. A Gentle Introduction to Object Recognition With Deep Learning. Machine Learning Mastery 27.1.2019. Accessed on 17.12.2021 https://machinelearningmastery.com/object-recognition-with-deep-learning/

CSC 2020. Supercomputer Mahti is now available to researchers and students – Finland’s next generation computing and data management environment is complete. CSC 26.8.2020. Accessed on 17.12.2021 https://www.csc.fi/en/-/supercomputer-mahti-is-now-available-to-researchers-and-students

Docs CSC 2021. Technical details about Mahti. Docs CSC 14.4.2021. Accessed on 17.12.2021 https://docs.csc.fi/computing/systems-mahti/

Dwivedi, P. 2020. YOLOv5 compared to Faster RCNN. Who wins?. Towards Data Science 30.1.2020.  Accessed on 17.2.2021 https://towardsdatascience.com/yolov5-compared-to-faster-rcnn-who-wins-a771cd6c9fb4

Géron, A. 2019. Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow. 2nd edition. Sebastopol: O’Reilly Media

Github.com 2021. TensorFlow 2 Detection Model Zoo.  Github.com 7.5.2021. Accessed on 17.12.2021 https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md

Hofesmann, E. 2021. Guide to Conda for TensorFlow and PyTorch. Towards Data Science 11.1.2021. Accessed on 17.12.2021 https://towardsdatascience.com/guide-to-conda-for-tensorflow-and-pytorch-db69585e32b8

Huellmann, T. 2021. How to build a dataset for image classification. Levity 9.11.2021. Accessed on 17.12.2021 https://levity.ai/blog/create-image-classification-dataset

Krishnan, B. P. 2019. Machine learning Vs Deep learning Vs Reinforcement learning. Medium 18.9.2019. Accessed on 16.12.2021 https://medium.com/analytics-vidhya/machinelearning-deeplearning-reinforcementlearning-ed7b217861c5

Lendave, V. 2021. A Comparison of 4 Popular Transfer Learning Models. Analytics India Magazine 1.9.2021. Accessed on 17.12.2021 https://analyticsindiamag.com/a-comparison-of-4-popular-transfer-learning-models/

Leo, M. S. 2020. How to Choose the Best Keras Pre-Trained Model for Image Classification. Towards Data Science 15.11.2020. Accessed on 17.12.2021 https://towardsdatascience.com/how-to-choose-the-best-keras-pre-trained-model-for-image-classification-b850ca4428d4

McCulloch, W.S. & Pitts, W. 1943. A Logical Calculus of Ideas Immanent in Nervous Activity. Bulletin of Mathematical Biophysics 5, 1943, 115–133. https://doi.org/10.1007/BF02478259

Pai, A. 2020. CNN vs. RNN vs. ANN – Analyzing 3 Types of Neural Networks in Deep Learning. Analytics Vidhya 17.2.2020. Accessed on 17.12.2021 https://www.analyticsvidhya.com/blog/2020/02/cnn-vs-rnn-vs-mlp-analyzing-3-types-of-neural-networks-in-deep-learning/

Patel, K. 2020. Image Feature Extraction: Traditional and Deep Learning Techniques. Towards Data Science 9.9.2020. Accessed on 17.12.2021 https://towardsdatascience.com/image-feature-extraction-traditional-and-deep-learning-techniques-ccc059195d04

Sharabok, G. 2020. Why Deep Learning Uses GPUs? Towards Data Science 26.7.2020. Accessed on 17.12.2021 https://towardsdatascience.com/why-deep-learning-uses-gpus-c61b399e93a0

Sharma, P. 2019. Image Classification vs. Object Detection vs. Image Segmentation. Analytics Vidhya 21.8.2019. Accessed on 17.12.2021 https://medium.com/analytics-vidhya/image-classification-vs-object-detection-vs-image-segmentation-f36db85fe81

Warden, P. 2017. How many images do you need to train a neural network? Pete Warden’s Blog 14.12.2017. Accessed on 17.12.2021 https://petewarden.com/2017/12/14/how-many-images-do-you-need-to-train-a-neural-network/

Özgenel, Ç.F. & Sorguç, A. G. 2018. Performance Comparison of Pretrained Convolutional Neural Networks on Crack Detection in Buildings. Berlin. 35th International Symposium on Automation and Robotics in Construction (ISARC 2018). Accessed on 17.12.2021 https://www.iaarc.org/publications/fulltext/ISARC2018-Paper154.pdf

30.12.2021


Tuomas Valtanen, Project Manager

Tuomas works as the web/mobile team leader, a software engineer and a part-time teacher at FrostBit at Lapland UAS. His tasks include project management and project planning, as well as software engineering and expertise in web, mobile and AI applications.

Master of Engineering

Mikko Pajula, Specialist

Mikko is a coder and part-time teacher. His programming tasks are mainly full stack development, with extra knowledge in GIS and deep learning.

Bachelor of Engineering

Creating a Digital Twin VR-Environment

DUKE is a collaboration project between Lapland University of Applied Sciences (Lapland UAS) and Rovaniemi Municipal Federation of Education (REDU). The idea of DUKE is to examine how VR solutions can be implemented for educational purposes while leveraging renewable energy. Our main pilot is a digital twin of an educational district heating plant located in Rovaniemi, and our secondary pilot is a digital twin of a customer-grade heat pump.

Digital twins allow us to visualize the environment with great immersion, which enables us to implement accurate functionality for various devices and systems. High immersion helps users subconsciously connect a virtual object to its real-life counterpart: seeing a virtual lever carries the same expectation as seeing a real lever; it should be interactable.

DUKE's customer-grade heat pump was made in cooperation with a project partner who wanted to incorporate VR solutions alongside their user manual. This way, the user can visualize basic procedures and maintenance work for the heat pump more accurately. Our DUKE engineers tackled this by working closely with the heat pump specialists: the specialists taught us how to do maintenance on the heat pump, and we, in turn, translated that knowledge into a VR simulation. The development process fundamentally revolved around an iterative workflow, with assessment and review meetings back and forth. In doing so, we ensured that the 3D recreation was accurate and of satisfactory quality.

Creating the digital twin VR-environment

A digital twin is a 3D virtual recreation of an object or a real-world environment. The term 'digital twin' can be understood as a measure of how closely or realistically the virtual recreation represents its real-world counterpart. Here at FrostBit Software Lab, we aim to create our digital twins to the extent where the 3D recreation is highly accurate and unmistakable when compared to its reference.

Side-by-side image of the rendered 3D-scene and the raw 3D-scene.

The workflow, and subsequently the challenges we face, depend largely on the use case. In addition to the high-fidelity 3D visual representation of the environment, the VR simulation hosts a physics simulation system. For this reason, we chose Unity, for its lightweight and adaptable nature, to implement our real-time VR simulation. Unity allows us to easily manipulate and optimize the VR simulation to work both on customer-grade computers and on high-end computers for a higher-fidelity experience.

The developer’s viewport inside Unity game-engine. The screenshot is taken from the work-in-progress district heating plant.

We kicked off the visual implementation process by taking reference images on-site. The images are captured from multiple angles in order to understand each object's shape and contour. They are also exceedingly useful for figuring out the object's material properties, such as how shiny it is, what type of metal it is and what kind of reflections it produces.

A reference image of the district heating plant.

The 3D modeling process uses blueprints in tandem with the reference images to get accurate measurements for a high-detail recreation. A keen observer may still find some discrepancies when comparing the screenshot to the reference image. This is due to readjustments: some parts of the district heating plant were deemed unnecessary for the VR simulation and therefore had to be cut or modified.

The texturing process uses the multi-angle images to figure out what type of material would be most faithful to its real-world version. We use HDRI images to create realistic lighting in the 3D scene. HDRIs are essentially 360-degree panoramic images that contain a vast amount of lighting data. Ultimately, we are able to create believable environments by following real-world lighting conditions while incorporating physically based rendering techniques.

A screenshot of the district heating plant digital twin.

VR-Implementation in a game-engine

The first step our developers need to take is to choose an ecosystem, in other words, which development pipeline to use. Locking into a development pipeline is essential in guaranteeing efficient feature implementation down the line. The ecosystem in this context consists of choosing which manufacturer's VR gear to use and which game engine and development toolkit to use. Our DUKE engineers ended up using the Unity game engine for its lightweight and adaptable nature.

A test-player trying out the playable demo of the DUKE VR-simulation

Our DUKE engineers chose the SteamVR software development kit for its wide VR gear support. The SDK provides accessible script blueprints that we can tailor for our own use: for example, basic functionality for object interaction and player movement. This way, more time can be allocated to polishing more advanced features, such as dynamic interaction between different objects while a mathematical simulation runs in the background.

The mathematical simulation is built with modularity in mind. Each part of the simulation is broken down into individual pieces. This gives us the flexibility to assign the simulation scripts to the 3D objects that would logically handle the same physics in the real world; for example, the scripts that handle pressure loss are assigned to the 3D objects that look like valves or pipes. One of the challenges we faced when working with a VR solution was that real-life dimensions did not translate well into a VR environment: EU-standard sized doors seemed too narrow, standardized knobs, bolts and valves seemed too small, and play-testers reported having a difficult time maneuvering in the VR environment overall. A balance needed to be struck between the digital twin nature of the project and ease of use for the players.

The current DUKE heat pump build contains four scenarios: “Tutorial”, “Water filter”, “Pressure adjustment” and “Water leakage”. Each scenario provides information on how to operate the customer grade heat pump and how to perform basic maintenance.

Tutorial -scenario

The tutorial scenario gives a thorough explanation of all the functions available in the project. This scenario also includes information covering the main parts of the heating system. It is usually the first scenario the player should launch, since it also teaches the player how to use the VR gear.

Water filter -scenario

The user can learn how to clean the water filter using this scenario.

Pressure adjustment -scenario

The user can learn how to adjust the heat pump to a correct pressure level in the system and how to use the safety valve to release the pressure from the system.

Water leakage -scenario

The Water leakage -scenario provides guidelines on how the heat pump system should be housed and maintained. We show the most common issues that can be encountered in such a space.

The total budget of the project is 761 732 euros, of which the Regional Council of Lapland has granted 609 385 euros in European Regional Development Fund (ERDF) and state funding. The project will be executed in the timeframe of 1.1.2020-31.12.2022.

This article has been evaluated by the FrostBit’s publishing committee, which includes Heikki Konttaniemi, Toni Westerlund, Jarkko Piippo, Tuomas Valtanen, Pertti Rauhala and Tuuli Nivala. The article publications can be found on the FrostBit publication blog.

20.10.2021


Onni Li, Specialist

Onni specializes in CG graphics at FrostBit Software Lab. The majority of his responsibilities revolve around creating 3D models, virtual environments and procedural materials for texturing. Onni is proficient with various game and render engines and has a solid understanding of how physically based rendering techniques work.

Bachelor of Engineering

Severi Kangas, Specialist

Severi worked as a programmer at FrostBit on the DUKE project.

Bachelor of Engineering

A new era of educational games – learning through digital solutions and gamification

Advances in technology have made it possible to create more advanced games for educational purposes. Educational games are no longer just the “point and click” style computer games made for children in the late 90s; developments have given us the keys to assist in more complex learning. The wildest dreams of digital twins that utilize virtual reality have already come true and are constantly being developed in the FrostBit laboratory. However, for the consumer, the virtual world viewed through glasses can still be foreign.

At the same time as the spearhead of technological development is far ahead of the consumer, we have also reached a different milestone in the world of educational games: each of us has access to a digital terminal that opens the door to learning in increasingly diverse ways. In practice, everyone can use educational games in their daily learning via a smartphone or computer. The quality and diversity of game content have improved in the same proportion, and in recent years various device manufacturers also seem to be slowly waking up to the opportunities brought by digitalization. Virtual reality will reach us all at some point, but for now we must at least leave behind the old moldy cereal-pack games.

Virtual learning in healthcare

In co-operation with the social and healthcare section, we are currently implementing a simple project as part of the Nursing RDI activity, the aim of which is to create a tutorial that teaches how to use an infusion pump. The Braun Infusomat Space is an everyday tool for nurses, and learning how to use it is one small but important part of nursing studies. Traditionally, students learn to use the infusion pump in the classroom without a patient, and in theory they can go through the operation of the device as many times as they wish. In practice, however, there are often many students, limited time, and personal learning difficulties that can get in the way, and because of this, traditional classroom instruction does not always guarantee that every student gets enough practice with the device.

A nursing student focuses mainly on learning to take care of the patient, and the use of technical equipment is not necessarily their primary focus. The Infusomat is a reasonably valuable and even slightly “threatening” device, and when practicing with it, the student often learns certain procedures and tries to avoid mistakes. Learning to use the device thus does not actually involve deep learning, but mainly the ability to repeat what has been learned. Such learning procedures lead to the expected result, but as an unfortunate side effect, the student does not actually get to know the device but merely uses it in predetermined situations. In problem situations or in the event of a disturbance, the student's ability to react and act is limited, and they may not feel confident when using the device.

The functions of an infusion pump (alpha)

Gamified learning can enhance students’ deep learning. Instead of learning how to use an infusion pump only in a controlled way in the classroom, the student can continue to “play with the device” in the virtual world. Once the operation of the entire device has been accurately 3D-modelled in the game and the interface is suitably simple, the player is able to try out different functions and resolve problem situations without worrying about accidentally breaking the pump. By utilizing the game, the student gains confidence in using the device, and operating the real infusion pump then feels much more familiar and easier to learn in practice.

How are the functions of the infusion pump transferred to the game?

The most important aspects of making this game are the following:

  • User interface
  • Device operation
  • Correspondence of the 3D model
The 3D-model of the infusion pump (alpha)

What is essential, of course, is that the developer not only thoroughly understands the operation of the device, but also models it with the accuracy required for its operation. A particularly big factor is user orientation: a game can be useless to its target audience if the target audience is unable to use it. The developer must assume that the user of the device has little or no experience in playing games, and must therefore design the functionality so that usability problems do not get in the way of the learning process. In practice, this means that features very familiar from many digital games, such as first-person character movement and camera control, must be omitted or made so simple that they don’t interfere with the gaming experience. The player should not be forced to learn a seemingly complicated interface before they even get the opportunity to learn how to use the device itself.

The development phase has two stages: device modelling and construction of the game mechanics. The Blender 3D software is used to model the infusion pump, and the game mechanics are created with the Unity game engine. In modelling the device, it is essential to model all those external features that matter either in terms of game mechanics (buttons, screens, power cords) or visual equivalence (the rounded arc of the door, aspect ratios). In the game mechanics, an accurate representation of the infusion pump's operation is important. For example, when you press the start button on the device, the infusion pump emits a small beep, the lights turn on, and then a text appears on the display to indicate that it has started. All of these functions are essential to make it easy for the player to transfer what they have learned from the digital world to the real one. However, it is not essential that the internal operating logic be exactly the same as that of the device being modelled.
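In Unity terms, that start-up feedback could look roughly like the sketch below. This is a minimal illustration with hypothetical field names, not the tutorial's actual code; the references would be wired up in the Inspector and the method hooked to the start button's press event.

```csharp
using UnityEngine;
using UnityEngine.UI;

// Minimal sketch of the start-up feedback: a beep, indicator lights
// turning on, and a start-up text appearing on the pump's display.
public class InfusionPumpPower : MonoBehaviour
{
    [SerializeField] private AudioSource beepSound;   // short beep clip
    [SerializeField] private Light[] indicatorLights; // front-panel lights
    [SerializeField] private Text displayText;        // text on the pump screen

    // Hook this to the start button, e.g. a UI Button's OnClick event.
    public void OnStartButtonPressed()
    {
        beepSound.Play();
        foreach (var indicator in indicatorLights)
            indicator.enabled = true;
        displayText.text = "STARTING...";
    }
}
```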

A real infusion pump

The end result is the crucial point: the game must be able to convince the player that it works in the same way as the real-world device it represents. The choices made in game development must therefore be viewed from the perspective of transferring what has been learned to the real world: if a function is important for using the real physical device, then it must also be implemented in the digital version. Thus, a game made of an infusion pump does not have to be a perfect digital twin, but rather a model that simulates its operation as accurately as possible.

The project is still ongoing, but looks very promising. Combining game development with the development of learning outcomes is an interesting and important direction that will certainly provide interesting data as the game develops. The tutorial is scheduled to be ready in June 2021. The game will run on a regular computer and on mobile devices, and in addition, the possibility of virtual reality will be explored.

The ‘TKI-toiminta digiaikaan’ -project is funded by the Regional Council of Lapland (Lapin Liitto) with ERDF funding.

29.04.2021


Samuli Valkama, Specialist

Samuli Valkama works as a specialist in the FrostBit Software Lab at Lapland University of Applied Sciences. The focus of his work is on project and game design, but video productions in particular have taken up a growing share of his time.

Bachelor of Arts (Media), University of Lapland

Historical accuracy versus playability – Designing the Struve Geodetic Arc mobile game

If a game, game-like learning environment or any other software using game technologies is based on real events, the relationship between realism and playability has to be taken into consideration during the development process. Imitating the real world precisely, with all its limitations set by physics, biology or the passage of time, doesn’t necessarily serve the original purpose or goal of the game or software. Whether the goal is to entertain or to educate, both can suffer from total realistic accuracy.

Making a game replicate its real-life counterpart as precisely as possible is sensible and sometimes absolutely necessary when a game is focused on a very limited subject, such as controlling a specific vehicle. In the case of vehicle simulators, the physical controllers are also often made to look and feel like the real thing. On the other hand, when the subject of the game is broader, decisions have to be made on which things to replicate and which things to change to enable e.g. playability, entertainment or better understanding of the bigger picture. The intended platform of the game also plays into these decisions. A keyboard-mouse combination, regular commercial game controller or a smartphone screen are all very generic controllers and as such can’t exactly replicate the feel and functions of any specific tools or items. If the chosen platform is mobile, physical size and performance of the devices as well as the inaccuracy of gesture controls set extra limitations.

This balancing act between realism, hardware limitations and the goal of the game has been, and will continue to be, an integral part of the design process of the Struve Geodetic Arc mobile game. The game is being produced as a part of The Northern parts of the World Heritage Struve Geodetic Arc project (struvenorth.net), partially financed by the EU’s Interreg Nord programme (ERDF).

The background story: What is the Struve Geodetic Arc?

Friedrich Georg Wilhelm von Struve (1793-1864) was an astronomer who also had an interest in geodetic surveying. As a part of his research he organized a triangulation survey with a triangle chain reaching from the Black Sea to Hammerfest. The goal of this survey was to gain a better understanding of the shape of the Earth near its poles. The whole chain includes 258 main triangles and 265 main station points stretching over 10 different countries. A total of 34 of the main station points are part of the UNESCO World Heritage site. (maanmittauslaitos.fi)

The Struve Geodetic Arc was accepted onto the UNESCO World Heritage list in 2005, but it isn’t particularly well known in the Nordics. The Northern parts of the World Heritage Struve Geodetic Arc project aims to improve the accessibility and awareness of the Struve Geodetic Arc. The purpose of the mobile game is to attract a new audience for the World Heritage site by means of entertainment.

The events and environment of the game are based on real events in what is currently Finland, Sweden and Norway (at the time Sweden-Norway and the Russian Empire) during the 19th century. However, the intent is not to make a “real” educational game or replicate historical events and circumstances as precisely as possible, even though it would be possible in theory. The Struve Geodetic Arc mobile game has been intentionally designed to handle its real-life backstory with a lighthearted and entertaining touch. The reasons behind these decisions can roughly be divided into three parts: reaching casual gamers, creating a better flow for the story, and the limitations of the mobile platform. These reasons and historical accuracy aren’t necessarily mutually exclusive, but in this project the choice was made to favor entertainment and a casual approach.

Handling time and order of events

When a several hundred pages long novel is adapted into a movie, the events of the story are shortened and perhaps rearranged to some degree, in order to make all the important things fit in a more compact format and to keep the viewer engaged for the entire duration of the movie. Time and order of events have been approached somewhat similarly in the development process of the Struve Geodetic Arc mobile game for the exact same reasons: limiting the total length of the game and making it easy for the players to follow the events and keep them interested in the game from start to finish.

In reality, measurements across the entire triangle chain spanned several decades. The focus of the game is mainly on the section going over Lapland, where the measurements also took years. Exactly replicating all events from a time period this long would inevitably make the game longer, regardless of how time is handled inside the game. A typical mobile game session isn’t very long: GameAnalytics, a company selling tools for analyzing game usage, states in their 2019 report that the average session length for a mobile adventure game was less than 15 minutes (GameAnalytics Mobile Gaming Benchmarks Report: H1 2019, p.13). If the total length of a game is measured in several dozen hours, like in some current AAA titles for PC or console, finishing it takes far more of these average-length sessions than a game whose total length is just a few hours: at 15 minutes per session, a 30-hour game takes 120 sessions, while a three-hour game takes only 12. In the Struve Geodetic Arc mobile game, one of the goals is to get the players to finish the game and go all the way to the northern endpoint of the triangle chain in Hammerfest. So, if the game is very long and consequently requires a large number of sessions to finish, players may feel that the game is too long and never finish it. Keeping the players engaged until the end of the game is a good enough reason to give up accuracy.

It was never intended that the actual historical timeline would be completely removed from the game, however. At the current development phase (spring 2021) the game is being designed to have calendar years pass according to the game’s internal time and to give the player a goal to reach the endpoint the same year the actual measuring crew got there. Some known places of stay might also be matched with the correct years.

Much like time, the order of events is handled flexibly: it was decided that events would be presented in a more linear fashion than in reality. In the game, the player’s party follows a route along the triangle chain in one direction, whereas in reality the measuring crew did not move linearly from point A to point B. During the measurements the crew stayed longer in some places than others, and measurements in a certain area could be done as separate trips from one place of stay. In the early stages of design, it would have been possible to choose game mechanics that supported movement more closely resembling the actual measurements, but the development team decided to make movement on the map more linear. The team felt it would be more motivating for the player to follow a clear route from one place to another compared to, for example, making separate trips from one place and returning there every time. One factor in this decision was that in the earliest design stages the team didn’t have information on the exact routes; it became available at a later phase. Lacking this information, it was easier to design the game around a linear route.

Handling time and events in a flexible manner sacrifices much of the historical accuracy. The reasoning behind this sacrifice is making the game more approachable. No background information or prior interest in the subject of the game is required from the player, nor does the game aim at educating players by force. The idea is first and foremost to offer a good experience, and only on the side tell the player about a subject that was possibly previously unknown to them, getting them interested enough to find out more through other means.

Effects of mobile platform

Regardless of the genre and content of the game, limitations set by the platform must be taken into consideration when developing for mobile devices. Even though the physical size and resolution of mobile device screens have been growing, the screens are still many times smaller than a typical laptop or desktop screen. This limits the number of menus and other elements that can fit on the screen at one time while still maintaining legibility. In addition, mobile devices are typically controlled with finger input, which is a considerably less accurate way of navigating than a game controller or a mouse. This in turn limits how small elements such as buttons can be.

When better accuracy is required, mobile devices can be controlled with a stylus, and a small number of stylus-controlled games have been made. Controllers made specifically for mobile gaming, closely resembling console controllers, are also sold. Both styluses and controllers are mostly sold separately, so when the aim is to reach as large an audience as possible and make using the game easy, it is not reasonable to require the user to have any kind of special controller device.

In terms of game design, screen size affects (among other things) the amount of information the player can be shown at one time. This amount can be regulated, for example, by showing only the information that is required in that precise menu or at that precise moment of the game. Another option is dividing the information into smaller sections and having separate menus or menu levels for each section. Neither of these options is without problems. If the total amount of information the player needs is large and most of it is contextual, not shown at all times, the limits of short-term memory might prevent the player from keeping all required things in mind. Then again, multi-leveled menus can be hard to navigate. This means that dividing and categorizing information, or changing the way it is presented, is not a complete solution on its own. The game itself must be designed to be simple enough that the amount of required information isn’t too great.

The poor accuracy of finger input means that the basic functions of the game can’t require precise or complex actions from the player. The controls must be simplified to a few essential buttons and simple gestures so that the learning curve isn’t too steep and continued use is effortless.
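As a rough illustration of this kind of simplification (our own sketch, assuming a Unity-style implementation rather than the game's actual code), the snippet below reduces all touch input to two actions: a tap selects, and a horizontal swipe pans the map.

```csharp
using UnityEngine;

// Minimal sketch: collapse all touch input into a tap and a horizontal swipe.
public class SimpleTouchControls : MonoBehaviour
{
    private Vector2 touchStart;
    private const float SwipeThreshold = 50f; // pixels; shorter moves count as taps

    void Update()
    {
        if (Input.touchCount != 1) return; // ignore multi-touch to keep controls simple

        Touch touch = Input.GetTouch(0);
        if (touch.phase == TouchPhase.Began)
        {
            touchStart = touch.position;
        }
        else if (touch.phase == TouchPhase.Ended)
        {
            Vector2 delta = touch.position - touchStart;
            if (delta.magnitude < SwipeThreshold)
                Debug.Log("Tap: select what is under the finger");
            else if (Mathf.Abs(delta.x) > Mathf.Abs(delta.y))
                Debug.Log(delta.x > 0 ? "Swipe right: pan map" : "Swipe left: pan map");
        }
    }
}
```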

The need for simplification means more adaptations and loss of historical accuracy. Actions that in reality are complex or demanding have to be made easier and more straightforward. In the case of the Struve Geodetic Arc mobile game, one example of this is the triangle measurement itself. The aim is to give the player factual information on the measuring process, but using the measurement devices required skill and precision, and those actions won’t be precisely replicated in-game. The realistic measurement actions will be replaced with simplified versions or with an entirely different set of related activities. When the basics of the game are easy to learn and playing doesn’t require complex actions, more casual players, or people who haven’t really played at all, can easily start playing the Struve Geodetic Arc game.

What historical things are included in the game?

Despite putting game mechanics and the requirements set by the chosen platform first in the development process of the Struve Geodetic Arc mobile game, the intention is not to make a generic adventure game that merely wears a cosmetic skin with some historical influences. The project team has gathered information on, for example, the progression of the actual measurements, known places of stay and the day-to-day life of people in the 19th century. The game development team has then incorporated this information into their designs. The game will have the player encounter problems similar to those the real measuring crew faced. The remoteness and challenging weather conditions of Lapland affected, for example, the ways measuring equipment could be transported and when measurements could be performed in the first place. One example of these conditions is that during the 19th century the actual road network in Finnish Lapland, excluding winter roads, reached only Kittilä, Kolari and Sodankylä, so when the player moves further north they can’t choose to use a road and have to advance by other means.

One goal of the game is also to have the player understand how significant the measurements were to society at the time. Using the information gathered from the measurements, new and more accurate maps could be drawn, and the significance of good maps for trade and warfare, for example, was and still is great.

Visual elements are one of the ways used to convey information and the feeling of the time period to the player. Even though the art style of the game is somewhat comic-like, the appearance of, for example, clothes, tools, buildings and measuring equipment is modelled after their real-life counterparts. The intent is to show events and life from nearly 200 years ago as lively and interesting, despite deciding to give up true historical accuracy in many ways.

More on The Northern parts of the World Heritage Struve Geodetic Arc project: struvenorth.net

More on the Struve Geodetic Arc in general: https://www.maanmittauslaitos.fi/en/struvegeodeticarc

Game Analytics report of 2019: https://progamedev.net/wp-content/uploads/2019/07/Benchmarks2019.pdf

31.03.2021


Sanni Mustonen, Project Manager

Sanni is a graphic designer with an interest in the technical side of things. Her work includes everything from UI design to visual identities and teaching in the Summer Game Studies.

Doctoral Student, University of Lapland’s Faculty of Art and Design

UX and Service Design in the FrostBit lab

Besides being an intern at the FrostBit Software Lab, I am also a PhD student majoring in service design at the University of Lapland. I am very honoured to be an intern at FrostBit. I got the chance to apply UX and service design in different projects, and I am happy to tell you about my experiences.

What is UX and service design?

Recently, UX design and service design have been very topical: you may have heard about many products and applications built on UX and service design principles. However, what exactly are UX design and service design?

UX stands for User Experience. In software design, UX designers focus on the logic of the software. Through UX design, users will not feel confused or get lost in 2D (applications, websites, etc.) or 3D (VR or AR) worlds. In most situations, UX design is closely linked with UI design (User Interface design).

To date, there is still no single official definition of service design. From my own understanding, the key point in service design is “empathy”, which means that the service designer must think of the “service process” from all stakeholders’ perspectives. For example, if you are developing a B2C website, the customers and staff who use the website, and even some third parties, are all stakeholders. Furthermore, service design covers the whole process for which the design solution is provided. Let me give an example with a restaurant: the first time you see the logo and facade of the restaurant, the service design has already begun. The dining environment, the taste of the food, the personal service and even the feedback you give: all of these can be part of the service design process.

From my perspective, service design is more like strategic design: in addition to the product itself, service design provides a more efficient and valuable design process by solving the “pain points”. Service design has a broader scope, which includes not only UX design, but also graphic design, product design, information design and so on.

Below, you can see a basic design process that I always follow:

How did I end up in UX and service design in the FrostBit Software Lab?

During my internship at the FrostBit Software Lab, I have done two main things: one is redesigning the FrostBit website and the other is doing interior design for the lobby outside the lab office. You could think that the website design is connected to UX design and service design – which is true. In my design process, I implemented animations for the website, which catch visitors’ attention immediately when they arrive at the website and direct them to the important information on the site. Moreover, some icons are combined with text: in this way, the information can be conveyed to readers in a very short time. Although there are common tips designers always apply in web design, what I want to talk about is the way of thinking: thinking of the readers and the information provider at the same time. When you do it this way, you can define the core of the design solution, balance the requirements of both sides, and understand the “empathy” I mentioned before.

The interior design for the lobby follows the same thinking process: determining the target groups, then the aims, the main functions and the design elements (for each group) in the lobby. In my experience, I used UX and service design to plan the “big picture” of the lobby, and then used interior design as a method to apply and present my ideas and design solutions.

During my internship, I not only improved my basic design skills, but also practiced the key points of service design at the FrostBit Software Lab. In my opinion, there are three key things that you need to keep in mind during the design process:

  • How to increase the value
  • How to use co-design
  • What is the new idea or solution

What about the future of service and UX design in FrostBit?

I am certain that service design and UX design have a bright future in the FrostBit Software Lab. No matter how the world changes, every piece of software FrostBit makes can change or guide people’s behaviour. This is why we use service design to see the whole picture and give the “correct” direction for our projects. At FrostBit, we are not only applying “real” design skills, but also the “empathy” way of thinking in our projects, and we are continuously working on ways to establish this. If you want to see how we use UX and service design in our projects to create inspiring outcomes, stay tuned for updates on our website!

10.03.2021


Nan Li, Specialist

Nan Li works as a specialist in the FrostBit Lab at LapUAS. She is a doctoral candidate majoring in Service Design at the University of Lapland. Service design and graphic design are her main work areas.

Doctoral Candidate of Service Design, University of Lapland

Game Dev team working in FrostBit

We already have about 40 employees working in the lab, as well as, of course, our trainees, exchange students and students working on their theses. Much of this activity happens in the Game Development team amongst various game projects. Our number of employees has grown tremendously in recent years, bringing a lot of new skills along. We are no longer just a team of engineers coding in a dark lab with the power of ‘Jolt Cola’. Professionals in 3D modelling, graphic design, service design, audiovisual production and game pedagogy, among others, have joined the team, each bringing their personal skills with them. Our team has really strong capabilities to carry out holistic projects from beginning to end. We constantly work closely with the Mobile and Web team, since almost every one of our projects needs at least a website or a back-end system. Building complex background systems is the core competency of the lab’s Mobile and Web team.

Getting to know the mining processes at the Mantovaara open-pit mine

What do we actually do in the Game Dev team? Our main tools are game engines: we mainly use Unity3D or Unreal Engine depending on the project. While our team name is Game Dev Team, we don’t only make games in the “traditional format”. We utilize game technology, for example, for visualizations, simulations, learning environments and marketing. This makes the work really versatile; we get to work with different industries, and we have made various implementations utilizing game technology, ranging from particle physics to medical care. However, we don’t only cooperate with different industries – we also often get to know and study what it is like to work in them, such as in mining. A good example is the KaiVi project, during which the programmers and modelers from the lab went on an internship to a mine to get acquainted with mining operations. Such action is paramount when it comes to building environments that reflect reality. It is also important for the development team to have some understanding of the subject industry and thus be able to communicate effectively with professionals.

Multidisciplinary skills and learning

The full use of gaming technologies requires the cooperation of several professionals. Each project begins with design and definition, involving subject-matter experts, designers, programmers and artists. At this stage, pedagogical solutions for learning environments are also considered and service design tools are included. The actual implementation involves different stages depending on the project, but typically a project starts with conceptualization and technology tests or prototypes, from which the transition to actual product development takes place. Suitable tools are selected for each project, as well as the necessary methods for carrying it out. We use the Scrum method in several projects: it enables agile development and works very well in the current situation, where work takes place largely remotely.

We use new technological equipment and solutions in several projects. For example, we get to implement different virtual reality environments and also look for new solutions to take advantage of virtual reality. However, virtual reality is just one of the new technologies we take advantage of. In addition to VR, we can utilize artificial intelligence, machine learning, sensors, motion platforms and controllers in our projects, amongst other things. We are always striving to find new ways to leverage technology in our operations to get the best results.

While we each have our job roles in the lab, that doesn’t mean the programmer just codes, or the modeler just models. Depending on your own skills and state of mind, you can participate in various tasks in a variety of ways: project planning, articles, presentations, development assignments, internship guidance, technical leadership, workshops, webinars, and many other tasks important to our operations. Cliché, but true: no two workdays are alike, and no two projects are alike.

Some of our staff also work as part-time teachers in ICT training. It is rewarding to be able to share our know-how with future engineers, and through teaching, the laboratory also makes its own contribution to the development of the region. In the upcoming summer of 2021, we are once again involved in arranging the Summer Game Studies at Lapland UAS. The planning phase is ongoing and the greatest Summer Game Studies yet are on the way! Be sure to follow the FrostBit publications, as we will definitely report the mood from the SGS during the summer.

Go check out our projects in the lab Portfolio

26.02.2021


Toni Westerlund, Project Manager

There is no problem that Toni will not solve. Toni is a hardened professional in software engineering and programming: “propeller hat” is certainly the correct name for Toni. Gaming technology and virtual reality are especially close to Toni’s heart. He also enjoys teaching, which for him is nothing less than sharing his own knowledge and raising new propeller hats. Toni leads FrostBit’s Game & XR Team and also works as an alumni manager for the ICT field at LapUAS.

Master of Engineering

Visibility for Mining Industry with Gamification

There are as many as 46 mines in Finland, employing more than 5,000 people in 2018. Especially in northern Finland, the mining industry is an important employer for many and a source of livelihood in areas where diverse employment and welfare development can otherwise be challenging (Kaivosteollisuus.fi). There is also a wide range of mining education available throughout Finland, which is why the reform of degree structures and the development of modern learning methods are topical, given the growing demand in the field.

Since mining is such a practical field, how could its education be modernized? The Migael project answers this question with gamification and modern technologies. The aim of the project is to develop mining education by creating a virtual learning environment with a variety of scenario-based exercises. By the beginning of 2021, three exercises had been completed in the project, and the fourth will be finalized during February. Each exercise focuses on a different “scenario” in the open-pit mine or the underground mine. For each exercise, different platforms and technologies have been utilized to achieve the learning goals:

Charging/Blasting
In the first exercise, the player practices the charging of explosives in a 2D view, and can also view the simulation of charging and blasting in both 3D and Oculus Quest virtual views. In the exercise, the player gets to simulate charges of different sizes and see what their blasting effect would look like.

Workcard
The second exercise was also created to be played with Oculus Quest VR glasses. In this exercise, the player performs an occupational safety card inspection at an underground mine. The player is able to move around the underground mine, observe the necessary safety measures and fill in the occupational safety card accordingly.

Driving departure
The third exercise was created to be played on both Oculus Rift VR glasses and Android phones, and it goes through a mining vehicle departure inspection. The player must check the worker’s safety equipment and the condition and usability of the mining vehicle before it can depart.

PRE- and POST-blasting safety measures
In the fourth exercise, the player performs the pre-blast and post-blast safety measures. The exercise was made as a more traditional 3D desktop “serious game”, since that makes it possible to use more textual teaching material on a wide range of topics. The player gets to observe objects in the open-pit mine environment to secure the area before and after the blast.

Different technical implementations enable different ways of presenting or simulating the subject being taught. VR technology is able to take a player or learner into a realistic learning situation that corresponds as closely as possible to a real-life situation. In this way, training situations can be usefully simulated when carrying them out in real life would be demanding, costly and often dangerous. Implementation on a mobile platform enables easier learning regardless of time and place, and this can help reach a wider group of users. Therefore, in the third exercise, for example, the driving departure was implemented for both VR glasses and mobile.

However, virtual glasses create their own challenge if the material and subject area is extensive and requires more than observation and “tangible measures”. While instructions can be shown as “floating texts” in virtual glasses, interactive texts combined with player movement and other activity can create an unnecessary challenge in the VR world. For this reason, the fourth Migael exercise was created as a 3D desktop game, as the topic required a lot of text in the form of teaching material and other information. In this exercise, it was necessary to plan particularly carefully which operations should and could be “gamified”, as there are several safety measures related to blasting. Some of the most generic and difficult operations were implemented in the form of cinematic transitions and instructions given by the game’s characters:

Thus, different technological implementations enable different ways of learning and help achieve certain learning goals. Some technologies allow for a more realistic simulation of exercises, while others allow for a broader presentation of the content and the possibility to reach a larger target audience. The exercises produced in Migael form a gameful learning environment where you can experience the benefits of many new technologies and learn about the mining industry in an inspiring and safe way. The aim of the project is to make the mining industry more visible and interesting, especially for students, and to enable important and safe training regardless of time and place.

In addition to the fourth exercise being completed soon, the early part of 2021 is eventful for the project: a Teams webinar titled ‘Utilization of modern technologies in the mining sector’ will be held on 25.01.2021 as part of the project. In addition to the FrostBit lab, the webinar will feature speakers from VTT and Kajaani University of Applied Sciences, who will present, amongst other things, the possibilities of gamification, VR/AR technologies and new sensor technologies in the mining industry. Mining staff, project staff and students of Lapland University of Applied Sciences are invited to join and listen to the webinar.

The project’s activities will be updated on its official website, which includes a blog and the downloadable exercises: www.migael.fi

In addition, see the FrostBit portfolio-page of Migael: https://www.frostbit.fi/en/portfolio/migael-en/

The North Ostrobothnia Centre for Economic Development, Transport and the Environment has granted 337 502 € of European Social Fund (ESF) and state funding for the Migael project. The total cost of the project is 450 002 €.

References

www.kaivosteollisuus.fi/fi/kaivosala-suomessa

15.01.2021


Tuuli Nivala, Specialist

Tuuli works as a specialist in FrostBit at LapUAS. In her current role she utilizes her game-pedagogical knowledge in game design, and participates in project planning and some 2D graphic design. She also produces marketing and AV media material for the laboratory.

Master of Media Education, University of Lapland

Cross-platform mobile development, the Flutter experience

FrostBit Software Lab has followed the development of the Flutter platform (created by Google) with great interest for the last few years. Our laboratory has also actively applied Flutter in various projects to enhance the development of our mobile applications. The need for an efficient cross-platform technology has been great, since producing two separate native mobile applications (Android + iOS) has proven to be too expensive and inefficient for our purposes, whether considering the development or the maintenance phase.

The strength of native mobile applications lies in the possibility of total customization and native access to the features of the mobile phone (camera, sensors, etc.). On the other hand, the weaknesses of the native approach are the need for sizable human resources and the maintenance challenges posed by rapidly evolving mobile phone ecosystems.

Previously, our laboratory created applications using the PhoneGap platform, but the limitations it imposes on mobile application design proved too great for the technical needs of the FrostBit Software Lab.

Google released the first version of Flutter in 2017, and it immediately sparked interest among the developers at FrostBit Software Lab. At that time, we had two sizable mobile projects upcoming in our project calendar, so we boldly decided to try Flutter out in both projects to avoid the challenges that come with developing two separate native mobile applications.


Going into 2021, we are tentatively yet very positively surprised and optimistic about the possibilities of using Flutter to create cross-platform applications. We are certainly going to use it in upcoming projects as much as we can. Then again, nothing in this world is perfect, and despite all its strengths, Flutter also has its weaknesses. We have gathered our experiences with Flutter from the past few years, and here is our conclusion of what we think of Flutter so far:

Flutter strengths / pros:

  • Development is easier compared to PhoneGap or Xamarin, for example, since Flutter needs less platform-specific code
  • Flutter is cross-platform; it’s possible to develop applications for Android, iOS, Windows, Linux, macOS, web etc. at the same time
  • It’s fast to develop applications with Flutter
  • Flutter allows the developer to create complex and fully customized UI components, since everything related to layout can be altered
  • Flutter has good documentation and a great number of examples are available, and Flutter’s user base is growing rapidly
  • Flutter is powered by a very performance-efficient 2D UI engine (sky_engine), which also works well on a 120 Hz display
  • Flutter uses the Dart programming language, which is an easy-to-learn object-oriented language
  • Google is a huge organization with the resources to develop Flutter efficiently, and it uses Flutter in some of its own products
  • Flutter code compiles into native application code on the different platforms and devices
  • Flutter already has good development tools, for example for Android Studio and Visual Studio Code
  • Flutter is updated rapidly

Flutter weaknesses / cons:

  • Complex user interfaces can be tedious to develop
  • Since Flutter is updated frequently, a great number of changes are integrated into the platform continuously, which can make managing large projects difficult
  • Both the Dart language and Flutter are relatively new technologies and therefore change rapidly, although this problem will fix itself over time
  • Many of the best features of Flutter are still in the development stage (null safety, web and desktop application support, etc.)
  • If Flutter does not have a plugin for a certain platform-specific feature (Android or iOS), you will need to write platform-specific code yourself
  • Because of Flutter’s popularity and its relative ease of application development, there are numerous third-party plugins of questionable quality
  • Since Flutter is a new technology, so-called “best practices” and recommended architectural designs haven’t really been formed or standardized yet

In conclusion, we think Flutter has far more strengths than weaknesses, and most of the weaknesses are related to the fact that Flutter is still a new technology.

If we also take a look at a few internet articles on this subject, we can conclude that many other developers around the world have thoughts similar to ours (e.g. Rozwadowski 2020; Costa 2019; Sannacode 2020; Powalowski 2019). According to these articles, Flutter's weaknesses also include the large size of the resulting mobile application, as well as certain compromises related to the layout and user interface design recommendations of the different platforms (e.g. Material Design on Android versus iOS conventions). Here at FrostBit Software Lab, we do not consider these weaknesses too problematic for our projects for now, however.

We have used Flutter in two major projects, the first being “Arktori” and the other “DWELL”. In the Arktori project, we are developing a mobile application with which the rectors of northern Finland can network with each other, enhance their professional knowledge and mentor each other in their daily work. In the DWELL project, we are developing a mobile application that promotes and endorses communal living in apartment buildings. The pilot apartment building in the DWELL project is the DAS Kelo student dormitory. We have been quite happy with Flutter in both projects when it comes to mobile application development.

Some screenshots from the Arktori and DWELL mobile applications:

Arktori mobile application (Flutter), development version, December 2020
DWELL mobile application (Flutter), development version, December 2020

Flutter has proven to be a powerful tool for creating cross-platform mobile applications. However, it remains to be seen how flexible and efficient Flutter will be in the future when it comes to web and desktop applications. If Flutter becomes truly competitive outside mobile applications, it will be a realistic scenario (from the software developer’s point of view) to concentrate mostly on Flutter when developing applications, supported by other technologies in cases where Flutter is not the optimal approach.

For now, we are not completely certain here at FrostBit whether Flutter is the all-around efficient application development platform for every kind of application, but no one can deny that Google is certainly trying hard to make it one!

In any case, we are going to continue watching the development of Flutter with great interest, and are always ready to try out Flutter in new projects and use cases in the future!

11.12.2020


Tuomas Valtanen, Project Manager

Tuomas works as the web/mobile team leader, a software engineer and a part-time teacher in FrostBit at Lapland UAS. His tasks include project management, project planning as well as software engineering and expertise in web, mobile and AI applications.

Master of Engineering


How can immersive technologies (VR/AR) foster meaningful education and travel experiences during the pandemic?

1. VR in teaching at Lapland University of Applied Sciences (Photo credits: FrostBit Software Lab)

Often, challenging times bring resilience and inspiration for creative solutions. Now more than ever, digitalization and technology have kept us going and connected; nevertheless, we have all experienced the effects that a long day of working or teaching with digital tools has on our mental and physical health. For numerous reasons, working and connecting through a flat screen is simply not enough. As humans, we need more; we need to connect in a meaningful way. These challenging times have fostered the rise of creative digital solutions, especially immersive technologies (virtual and augmented reality, VR/AR), since they enable us to travel without leaving home or to perform tasks that we would not be able to complete otherwise. Is this the right time for VR/AR to be the next digital tool and tech disruption that will elevate the way we work, connect and travel? All the odds are in favor, as well as the hype.

How can VR/AR add value to education?

Education is definitely one of the areas where VR/AR technologies can create a significant impact, given that most teaching and training is currently conducted remotely. The current teaching and learning methods, based on online lectures and classes, lack interactivity and versatility; hence, immersive technologies can enhance the way we teach and learn. Think about the possibility of a teacher putting on a VR headset and ‘teleporting’ around the world during geography classes using Google Earth. Similarly, in other school subjects, students can make an immersive and engaging virtual visit, for instance to the Colosseum in Rome during history classes. A recent Stanford University study conducted with middle school students examined the differences in cognitive learning by comparing the use of desktop and VR videos to teach about coral reefs and ocean acidity. The study showed that the students who watched the VR videos had higher learning scores than the students who watched the desktop videos. In addition, the students who experienced the VR videos about the coral reef showed higher scores in perceived self-efficacy compared to the group who watched the desktop videos.

2. Virtual visit of Florence, Italy through Google Earth (Photo credits: Google Earth VR, Steam)

One major drawback is the high price of such technologies, although the devices are becoming more affordable and of higher quality as time passes. Besides, teachers in universities, and especially in high schools, have limited access to VR/AR devices because they are mostly used by specific R&D or research groups. An additional significant factor that has prevented the widespread adoption of VR/AR technologies in teaching is that teachers are often afraid to implement new technologies in their teaching while ensuring that they complement the curriculum goals. What is the starting point then? Being curious and embracing innovation in teaching! As a following step, gain a general understanding of the technology and how you could utilize it in teaching. Ideally, if your school has a dedicated lab or environment for VR/AR, that is the first place to go. Ask for a general introduction and possibly request to borrow one device that you can explore by yourself. If that is not possible, there is a lot of information about the role of VR/AR in education online. Moreover, if your organization does not have professional devices, you can start by experimenting with more affordable options on the market, for example the Samsung Gear VR. Entering the world of immersive technologies is still a confusing pathway, but fortunately, access to VR/AR information and technology is increasing day by day.

3. Student painting in VR

In which areas can VR/AR play a critical role at the moment?

Healthcare

Due to the complexity of the situation regarding physical distancing, healthcare is one of the sectors that needs solutions enabling remote collaboration and simulations. Several studies have proven the effectiveness of VR/AR in healthcare training and simulations; however, the employment of immersive technologies in healthcare is still in its very early stages. According to the Gartner Hype Cycle for Digital Care Delivery Including Telemedicine and Virtual Care, VR/AR for care delivery are ranked as ‘on the rise’ technologies, indicating that the real potential will evolve together with the technologies and applications. XR (extended reality, VR/AR/MR) simulations in the healthcare sector are estimated to reach a market value of around 850 million euros by 2025, accompanied by a significant improvement of the technical and market value chains globally.

Considering the importance of immersive technologies in enhancing learning and safety in the healthcare sector, Lapland University of Applied Sciences has applied for international funding for a project that will create cutting-edge VR/AR solutions for healthcare education and medical staff. Our goal is to make healthcare training and simulations accessible to students and professionals, enabling hands-on experience that is safe and enhances learning. Another recent study indicates that medical students who trained in VR scored better in all categories than the traditionally trained group, especially regarding information retention; the total test score showed an overall improvement of 230%. The future of healthcare training and simulations is surely based on immersive technologies.

Industry Training and Simulations

Similarly to the healthcare sector, enterprise training and simulation is another critical area where immersive technologies can be a solution to the disruptions caused by the pandemic. Long-term remote work has called for solutions for remote training and simulations. XR training and simulations for different sectors of industry are one of the key areas of expertise at the FrostBit Software Lab. We have developed VR training and simulations for the mining industry, renewable energy production, reindeer herding, real estate, and forestry.

4. Creating the digital twin of the district heating power plant (Credits: DUKE project)

The ongoing DUKE project will develop a digital twin (digital representation) of the district heating power plant in Jänkätie, Rovaniemi, which will give students and new operators hands-on experience of operating the plant without the need to be physically present. It is a cost-effective and safe solution that will make learning more accessible and effective for students.

Tourism

The tourism sector in Lapland has been massively impacted by the pandemic, and the negative effects are predicted to be long-term. While hoping for a better future, very few companies have shifted their attention towards innovative and creative solutions for selling their experiences. That might be due to a lack of awareness about the possibilities on the market, technology readiness, or the cost of technological solutions. This is a time when the travel industry should adapt and look for creative and innovative solutions to reach their customers. XR solutions will not solve all the problems, but they will bring hope and new markets. How can we create an immersive experience of the Santa Claus Village, the Northern Lights, or Lapland's landscapes without the need to be physically here? Surely it is a complex task, but the technology is here; we have to step up and implement it. The Amazon Explore platform was launched to provide people with virtual traveling experiences: you can pay to have a tour with a private guide in different cities of the world, such as a virtual walking tour through Mexico City's urban art scene. The experiences are video- and desktop-based, and thus not immersive. Traveling experiences are immersive by nature, and VR leads to immersion.

Fostering distance collaboration through immersive technologies – the Arctic perspective on XR at VR Days 2020

This year, we were invited to share our XR expertise and use cases at VR Days, the most comprehensive VR/AR event in Europe, which this year was held remotely. VR Days blends different fields in which immersive technologies are applied, such as business, art, training and simulation, education, hardware and funding. This was a major step for FrostBit Software Lab and Lapland University of Applied Sciences, featuring among the most influential individuals and companies of the XR industry, such as Oculus, HTC, Facebook and Google. As the first Finnish VR laboratory, FrostBit has a long history of solving real-life challenges with immersive technologies. One example is the virtual graveyard experience created for Salla's Museum of War and Reconstruction, which allows visitors to access the German soldier graveyard located in the Finland-Russia border area. The VR experience enables an authentic graveyard visit without the need to go through border control between the two countries. Check out the speech for VR Days 2020 below:

Erson Halili speaking at VR Days 2020 New Horizons (Credits: FrostBit Software Lab)

What to consider when planning and creating immersive experiences?

The quality and content of a VR/AR experience can vary depending on the desired outcome of the specific experience; however, there are certain key steps that are critical when planning a VR/AR project. Most importantly, the combination of engineering, psychology and education is essential in creating meaningful VR/AR experiences. My perspective when planning XR experiences is to combine cognitive psychology, media education and user experience. What makes a VR experience meaningful? Consider these tips:

  1. The XR experience should solve real-life problems. Although that might not always be the case, ideally the technology solves a problem that cannot be solved otherwise.
  2. Employ user-centric design. The XR experiences should emerge from the users and be carefully designed with them.
  3. Strive for meaningful experiences. Meaningful experiences engage the user, and there is a clear intended outcome at the end of the experience.
  4. Make sure you combine multi-disciplinary teams and skills. As stated above, combining multi-disciplinary teams helps ensure that the three other points are considered. Multi-disciplinary teams and skills lead to holistic user-centric experiences. That is a core strength we have here at FrostBit, where we proudly co-work with a diverse team of engineers, educational specialists, designers, artists and others.
6. Visitors experiencing VR at FrostBit Software Lab

We here at FrostBit Software Lab (Lapland UAS) are on a mission to make technological solutions accessible to the community. Therefore, we are organizing info sessions with teachers, educational specialists, decision-makers and companies in Rovaniemi and the Lapland region on how to utilize immersive technologies during the pandemic. We want to offer our support in overcoming the barriers caused by the pandemic, and we believe that immersive technologies are a powerful assisting tool. Are you ready to explore VR/AR technologies and understand how they can help in your subject or business? Send us a message in advance to have a personalized meeting and demo. Teachers at Lapland UAS and the University of Lapland can pop in anytime at the FrostBit lab facilities.

References

Hakkennes, S., Craft, L. & Jones, M. (2020). Hype Cycle for Digital Care Delivery Including Telemedicine and Virtual Care. Retrieved on November 15 from https://www.gartner.com/en/documents/3988593/hype-cycle-for-digital-care-delivery-including-telemedic

Muller Queiroz, A. C., Nascimento, A., Tori, R. & da Silva Leme, M. (2018). Using HMD-Based Immersive Virtual Environments in Primary/K-12 Education. doi:10.1007/978-3-319-93596-6_11

Pottle, J. (2019). Virtual reality and the transformation of medical education. Retrieved on November 12 from https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6798020/

Written by Erson Halili

18.11.2020
