Cultural heritage represents the historical and cultural achievements of a nation, playing a vital role in studying human civilization and preserving national languages and scripts. This study utilizes virtual simulation technology to design a virtual pavilion for Chinese language and writing, employing image and text feature extraction algorithms for feature fusion and 3D modeling. The effectiveness of Chinese character extraction is validated through feature point matching, while the virtual exhibition’s impact is assessed via user experience scores. Results indicate that the proposed algorithm achieves accurate extraction with no misrecognition. User interest rankings highlight text images as the most influential factor, followed by visual imagery, pavilion experience, scene art, and language culture. Analysis of user feedback shows an average experience score exceeding 60 points, confirming the pavilion’s effectiveness in preserving and promoting Chinese language and writing culture.
Intangible cultural heritage is closely related to people's lives and refers to the practices, performances, expressions, knowledge and skills, together with the associated tools, objects, artifacts and cultural spaces, that groups, communities and individuals recognize as their cultural heritage to be transmitted from generation to generation [12,2,5]. It gathers the ideas of local folk culture and of art creators, embodies unique artistic achievements tested by time, and is a valuable asset left behind by the development and practice of human society [14,7]. In the past, when science and technology were underdeveloped, people had few channels of entertainment and traditional arts were very popular with the general public; today, however, the inheritance of intangible cultural heritage is broken to varying degrees, and some forms have even disappeared. With the addition of virtual reality interactive technology, intangible cultural heritage can be disseminated faster and more effectively than through one-way forms of information access, which helps to solve the problem of difficult inheritance.
As an intangible cultural heritage in the process of human civilization development, Chinese language and writing are the treasures of Chinese culture and an important carrier of Chinese literature, and the richness and polysemy of the Chinese language provide a broad space and unlimited imagination for the creation of Chinese literature [15,10,6]. Chinese language and writing have rich cultural connotations, and one can gain a deeper understanding of the connotations and development of ancient Chinese culture by studying the forms, meanings, and development history of Chinese characters [16,11,8]. Chinese characters also contain rich cultural symbols and meanings, the study of which can reveal the deeper connotations of ancient Chinese culture [4,3]. Therefore, the inheritance of Chinese characters ensures the inheritance of the artistic charm and cultural value of literary works, which is of great significance to the Chinese nation.
In this paper, a virtual pavilion of Chinese language and writing is designed with virtual simulation technology, using digital means to provide an immersive experience and promote the living inheritance and protection of language and writing. The study adopts Unity 3D technology and image-text fusion technology to construct a 3D model of the virtual pavilion, and analyzes the recognition performance of the proposed image-text feature extraction algorithm. The cultural inheritance effect of the virtual pavilion is evaluated from users' interest in its exhibits and from their experience feedback.
Firstly, at the level of user experience design, a first-person perspective is adopted to fix the user's visual focus, strengthen the sense of immersion and presence, and ensure that the user keeps actively exploring the virtual pavilion.
Secondly, at the level of visual guidance, the two-dimensional content must first ensure the quality of the textures in the space so that the realism of the scene is restored as far as possible, while keeping the UI interface consistent with the pavilion as a whole; the modeling must fully anticipate the interactive content, the three-dimensional models must guarantee the integrity of the space, and artistic processing is applied where necessary.
Finally, at the level of content narrative, the visiting route is designed to follow the user's cognitive logic without being too straightforward, creating appropriate "surprises" to stimulate the user's enthusiasm for obtaining information while ensuring that the interactive logic remains correct.
In addition to the above three points, lighting and sound are fully mobilized to enrich the design of the venue and heighten the atmosphere, giving users a more engaging visiting experience.
The realization stage of the Chinese language and writing virtual display system integrates the planar visual content and the 3D models into one scene, and then completes the system's interactive content through coding, so that the full functionality of the oracle bone script virtual display system is realized; this is the process from conceptual theory to actual display effect.
First of all, the materials are sorted and partitioned according to the design architecture of the pavilion and the main story line. Next, the graphic display design is completed, covering the exhibition wall design, UI design, texture processing, design elements and so on. Then, based on historical materials and scholars' research, 3D software is used to produce a restoration model of the Yin-Shang palace consistent with this study, and, according to the architectural characteristics and design structure, the related exhibits, games, decorations and other models in the virtual pavilion are refined and integrated into the three-dimensional space. The UI interface is then refined in C4D and all interactable interfaces are preset. The assets are packaged and exported to Unity3D, where, according to the interface layout and function modules, the human-computer interaction, roaming, merging, collision and other functions are completed; finally the apk file is packaged and released and deployed to VR glasses for functional testing.
Scene immersion function: scene immersion allows users to walk, watch and learn in the scene from a first-person perspective; through this function, users can view and study Chinese language and characters in an all-around way.
Display function: the display covers immersive action display and scene display, with light and sound simulating reality.
As a multidisciplinary fusion technology, virtual reality uses computer graphics, signal processing and other technologies to create a virtual environment whose visual, auditory and tactile impressions match the real world. The user interacts with the virtual environment: the user's action data are transmitted to the computer-simulated virtual world, the virtual world perceives them and produces feedback signals for the user [9]. Virtual reality technology can be represented by a mathematical model as follows.

Definition: let the user's real-world behavior data be $U$, let the transmitted and transformed user data be denoted by $U' = T(U)$, and let the current virtual scene be $S$. The working machine then reconstructs the virtual scene using $U'$. Virtual reality technology acts as a mapping function: the scene data and the real-world behavior form the input side of the function, and the change of the virtual scene is the dependent variable on the output side, so that the virtual scene changes in response to the changes produced by the user. The expression is:

$$S' = f\left(S, U'\right) = f\left(S, T(U)\right)$$
Scene modeling process: the following settings are adopted in the three-dimensional modeling software:
1) While preserving the appearance of the model as much as possible, reduce the number of polygons on the model surface so that the mesh is fully simplified.
2) Ensure the rationality of the model mesh distribution.
3) When modeling, it is necessary to align points with points, lines with lines, and when welding the model, pay special attention to the precise correspondence of the welding points, so as to ensure the stability of the structure.
4) Use texture mapping reasonably to ensure the appearance of the 3D model.
Environment modeling includes building modeling, ground modeling, and plant modeling. Building modeling adopts the polygonal modeling method; after a simple model is established, texture mapping ensures the sense of realism. Plant modeling calls the vegetation resource package in the virtual reality engine: the prepared vegetation is imported into the engine's resource package, and the vegetation is added to the ground through the Place Trees tool. When importing the plant models, the size, position and number of plants are designed and planned.
Manual roaming means that the user roams freely in the three-dimensional scene by operating the keyboard, mouse and other input devices. Through manual roaming, users can observe the scene flexibly and accurately from all directions, pick up virtual objects and obtain information. Manual roaming is a process of continuously changing the viewpoint position or view direction and re-rendering the scene according to interactive control commands. The interaction between the user and the scene mainly consists of basic actions such as moving forward, moving backward, moving left, moving right, looking up, looking down, turning left, turning right, ascending and descending.
Moving forward and backward simply translates the viewpoint a certain distance along the line of sight, while the direction of the line of sight remains unchanged. Let the viewpoint coordinates before the transformation be $(x, y, z)$, the unit line-of-sight direction be $(d_x, d_y, d_z)$ and the step length be $\Delta s$ (positive for forward, negative for backward); the viewpoint after the transformation is

$$(x', y', z') = (x + d_x \Delta s,\; y + d_y \Delta s,\; z + d_z \Delta s).$$
Moving left and right simply translates the viewpoint along the horizontal axis of the viewpoint coordinate system, perpendicular to the line of sight, while the direction of the line of sight remains unchanged; with unit side vector $(r_x, r_y, r_z)$,

$$(x', y', z') = (x \pm r_x \Delta s,\; y \pm r_y \Delta s,\; z \pm r_z \Delta s).$$
Looking down and looking up refer to the change of the scene seen when the user lowers or raises the head. They do not change the position of the viewpoint; only the direction of the line of sight changes, rotating by a certain angle around the x-axis of the viewpoint coordinate system. Let the angle of rotation be $\alpha$; the line-of-sight direction $(d_x, d_y, d_z)$ becomes

$$d_x' = d_x,\quad d_y' = d_y \cos\alpha - d_z \sin\alpha,\quad d_z' = d_y \sin\alpha + d_z \cos\alpha.$$
Turning left and turning right refer to the change of the scene when the user turns to the left or to the right. They do not change the position of the viewpoint; only the direction of the line of sight changes, rotating by a certain angle around the y-axis of the viewpoint coordinate system. Let the angle of rotation be $\beta$; the new line-of-sight direction is calculated analogously to looking up and down, with the rotation applied about the y-axis:

$$d_x' = d_x \cos\beta + d_z \sin\beta,\quad d_y' = d_y,\quad d_z' = -d_x \sin\beta + d_z \cos\beta.$$
Ascending and descending only increase or decrease the height of the viewpoint (the y coordinate), while the direction of the line of sight remains unchanged, i.e. $x' = x$, $y' = y \pm \Delta h$, $z' = z$.
The above are the basic ways for users to interact with the scene; in practical applications they are often combined to give the user a broader view.
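As an illustration of these viewpoint operations, the sketch below implements the basic transforms with NumPy. The class, method names and default values are illustrative assumptions, not taken from the pavilion's actual code, and the sign conventions depend on the handedness of the scene.

```python
import numpy as np

class Viewpoint:
    """Minimal first-person viewpoint: a position plus yaw/pitch angles."""
    def __init__(self, position=(0.0, 1.7, 0.0), yaw=0.0, pitch=0.0):
        self.position = np.array(position, dtype=float)
        self.yaw = yaw      # rotation about the y-axis (turn left/right)
        self.pitch = pitch  # rotation about the x-axis (look up/down)

    def direction(self):
        """Unit line-of-sight vector derived from yaw and pitch."""
        cp = np.cos(self.pitch)
        return np.array([np.sin(self.yaw) * cp,
                         np.sin(self.pitch),
                         np.cos(self.yaw) * cp])

    def move_forward(self, step):
        """Forward/backward: translate along the line of sight (negative step = backward)."""
        self.position += step * self.direction()

    def strafe(self, step):
        """Left/right: translate along the horizontal axis perpendicular to the view."""
        side = np.cross(self.direction(), np.array([0.0, 1.0, 0.0]))
        side /= np.linalg.norm(side)
        self.position += step * side

    def turn(self, angle):
        """Turn: rotate the view direction about the y-axis."""
        self.yaw += angle

    def look(self, angle):
        """Look up/down: rotate the view direction about the x-axis."""
        self.pitch = np.clip(self.pitch + angle, -np.pi / 2, np.pi / 2)

    def elevate(self, dh):
        """Ascend/descend: change only the height (y) of the viewpoint."""
        self.position[1] += dh

# Example: walk forward 2 m, turn 30 degrees, look up 10 degrees.
vp = Viewpoint()
vp.move_forward(2.0)
vp.turn(np.radians(30))
vp.look(np.radians(10))
print(vp.position, vp.direction())
```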
Auto roaming pre-sets a roaming path so that the viewpoint travels along it, achieving roaming in the 3D scene. The roaming path can be set in the scene plan by clicking control points with the mouse: the user picks a series of control points on the floor plan and specifies the elevation and speed of each, and the 2D screen coordinates are then converted to 3D scene coordinates to obtain a sequence of control points in 3D space. The roaming path is a curve in three-dimensional space, determined from the control points by a chosen interpolation method.
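A minimal sketch of such path interpolation is given below, assuming SciPy is available and using a cubic spline as the interpolation method; the paper does not specify which interpolation it uses, and the control points, speed and frame rate are illustrative.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Hypothetical control points picked on the floor plan, already converted
# from 2D screen coordinates to 3D scene coordinates (x, y = elevation, z).
control_points = np.array([
    [0.0, 1.7,  0.0],
    [5.0, 1.7,  3.0],
    [9.0, 2.2,  8.0],
    [4.0, 1.7, 12.0],
])

# Parameterise the path by cumulative chord length so spacing follows distance.
deltas = np.diff(control_points, axis=0)
t = np.concatenate([[0.0], np.cumsum(np.linalg.norm(deltas, axis=1))])

path = CubicSpline(t, control_points, axis=0)

# Sample viewpoint positions along the path at a constant travel speed.
speed = 1.5          # metres per second (illustrative)
frame_rate = 30.0    # frames per second
samples = path(np.arange(0.0, t[-1], speed / frame_rate))
print(samples.shape)  # one interpolated viewpoint position per frame
```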
A common method for collision detection in virtual scenes is to use a bounding box that wraps around each object. The basic idea of the bounding box method is to substitute simple geometry for complex geometry: a coarse test is first made on the objects' bounding boxes; if the bounding boxes intersect, the enclosed geometry may intersect, and if they do not intersect, the enclosed geometry certainly does not. In this way a large number of object pairs that cannot possibly intersect are rejected quickly, and only the potentially intersecting geometry needs to be examined further. Typical bounding volumes include the bounding sphere, the axis-aligned bounding box (AABB) and the oriented bounding box (OBB). The choice of bounding volume usually depends on the application area and the constraints it implies; collision detection algorithms can be analyzed with the help of a cost function.
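The sketch below shows the coarse AABB test described above; the object names and coordinates are illustrative assumptions, and the exact triangle-level test that would follow is omitted.

```python
import numpy as np

def aabb_of(points):
    """Axis-aligned bounding box (AABB) of a set of vertices: (min corner, max corner)."""
    pts = np.asarray(points, dtype=float)
    return pts.min(axis=0), pts.max(axis=0)

def aabb_intersect(box_a, box_b):
    """Two AABBs intersect only if their extents overlap on every axis."""
    (min_a, max_a), (min_b, max_b) = box_a, box_b
    return bool(np.all(max_a >= min_b) and np.all(max_b >= min_a))

# Coarse phase: only object pairs whose boxes intersect are passed on to the
# expensive exact test; pairs with disjoint boxes are rejected immediately.
avatar_box  = aabb_of([[0.0, 0.0, 0.0], [0.5, 1.8, 0.5]])
exhibit_box = aabb_of([[0.4, 0.0, 0.3], [1.6, 2.2, 1.4]])
if aabb_intersect(avatar_box, exhibit_box):
    print("possible collision: run the exact geometry test")
else:
    print("no collision possible")
```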
By summarizing the contents of the previous chapters, a preliminary understanding of the overall flow of language and text image processing can be formed; the specific process is shown schematically in Figure 2:
The specific description of each step is as follows:
The research samples in this paper are all acquired with high-definition photography, which reduces interference factors in the captured language-text images and highlights the text edge features. In addition, according to the carving characteristics of the language text, each image is acquired under oblique lighting from the top, bottom, left and right of the same piece of text.
This paper first analyzes the collected images, discussing the characteristics and problems of the language-text images, and then carries out denoising, single-character segmentation and basic text skeleton extraction.
On the basis of this preprocessing, and drawing on the theory of mathematical morphology and the wavelet transform, the text outline in the language-text image is extracted using the trunk extraction algorithm based on Chinese character strokes.
A simple image fusion operation is then performed on the language-text images after outline extraction: the images of the same target taken under oblique lighting from the four directions (top, bottom, left and right) are fused into one image, giving a more complete binary image of the font outlines. Finally, manual checking removes a small number of unreasonable regions to obtain the final language-text result.
For three-dimensional display of the extracted text outlines, a three-dimensional virtual museum scene model is first constructed, and the acquired language text is then embedded in it for display.
Digital cameras capture color images, and text extraction does not require color information, so the color image is first converted to grayscale and then low-pass filtered. Because the text information, especially the text edges, must be preserved as far as possible, only slight low-pass filtering can be applied here.
Because oblique lighting may illuminate the image unevenly, homomorphic filtering must be applied. The process is as follows: an unevenly illuminated language-text image $f(x,y)$ can be modeled as the product of an illumination component $i(x,y)$ and a reflectance component $r(x,y)$:

$$f(x,y) = i(x,y)\, r(x,y).$$

Then there is

$$\ln f(x,y) = \ln i(x,y) + \ln r(x,y),$$

so that a Fourier transform, a filter that attenuates the low-frequency illumination component while preserving the high-frequency reflectance component, an inverse transform and an exponentiation yield the illumination-corrected image.
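A minimal sketch of this preprocessing chain (grayscale conversion, mild low-pass filtering and homomorphic correction) is given below, assuming OpenCV and NumPy; the cut-off frequency, gains and file name are illustrative assumptions rather than the paper's actual parameters.

```python
import cv2
import numpy as np

def preprocess(path, sigma=0.8, gamma_low=0.6, gamma_high=1.4, cutoff=16.0):
    """Grayscale + slight low-pass + homomorphic filtering for an oblique-light image."""
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE).astype(np.float64)
    # Slight Gaussian low-pass so the text edges are largely preserved.
    gray = cv2.GaussianBlur(gray, (3, 3), sigma)

    # Homomorphic filtering: f = i * r, so work on log f = log i + log r.
    log_img = np.log1p(gray)
    spectrum = np.fft.fftshift(np.fft.fft2(log_img))

    # Gaussian high-emphasis filter: attenuate illumination (low frequencies),
    # boost reflectance (high frequencies).
    rows, cols = gray.shape
    u = np.arange(rows) - rows / 2.0
    v = np.arange(cols) - cols / 2.0
    d2 = u[:, None] ** 2 + v[None, :] ** 2
    h = (gamma_high - gamma_low) * (1.0 - np.exp(-d2 / (2.0 * cutoff ** 2))) + gamma_low

    filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * h)).real
    corrected = np.expm1(filtered)
    return cv2.normalize(corrected, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

# result = preprocess("glyph_top_light.png")  # hypothetical file name
```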
The extraction of text outlines from language-text images can be regarded as an information extraction process: for a preprocessed language-text image, the text outline is extracted from the noisy background to obtain a new language-text image. For the research samples in this paper, the existing trunk extraction algorithm is improved, combined with the analysis of text structure in the previous section, and an outline extraction method based on the stroke characteristics of Chinese characters is proposed.
The algorithm in this paper completes the extraction of the text outline in three steps. In the first step, the widths of the horizontal, vertical, left-falling and right-falling strokes of the Chinese characters are determined: wavelet multi-scale edge detection is applied to the text outline in the language image, and the corresponding stroke widths are derived from the result. In the second step, the trunk extraction algorithm filters out noise and fine details together to obtain the trunk of the text. In the third step, the trunk expansion algorithm takes the trunk obtained in the second step as the starting point and, based on the stroke widths determined in the first step, incorporates the text details that satisfy the width conditions, yielding the complete outline of the text.
In this paper, the chosen smoothing function is the cubic B-spline function; the resulting compactly supported quadratic B-spline wavelet is odd-symmetric and introduces no time or space offset, and can therefore be used to detect the edges of the Chinese characters in the image. The specific implementation process is as follows.
For the grayscale image $f(x,y)$, the wavelet transform at scale $s$ in the row direction is

$$W_s^1 f(x,y) = f * \psi_s^1(x,y),$$

and the transformation formula in the column direction is

$$W_s^2 f(x,y) = f * \psi_s^2(x,y),$$

with modulus $M_s f(x,y) = \sqrt{\left|W_s^1 f(x,y)\right|^2 + \left|W_s^2 f(x,y)\right|^2}$. When $M_s f(x,y)$ reaches a local maximum along the gradient direction, the point is taken as an edge point of a stroke, and the distance between the paired edge points on the two sides of a stroke gives the stroke width.
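The sketch below approximates this multi-scale edge detection with derivatives of a Gaussian smoothing kernel (SciPy), since the exact quadratic B-spline wavelet coefficients used in the study are not given; the scale, threshold and width-estimation heuristic are illustrative assumptions.

```python
import numpy as np
from scipy import ndimage

def stroke_edge_map(gray, scale=2.0):
    """Multi-scale edge modulus: row- and column-direction responses (stand-ins
    for W1 and W2) from smoothed first derivatives at the given scale."""
    gray = gray.astype(float)
    w1 = ndimage.gaussian_filter(gray, scale, order=(0, 1))  # row direction
    w2 = ndimage.gaussian_filter(gray, scale, order=(1, 0))  # column direction
    modulus = np.hypot(w1, w2)
    # Edge points: modulus above a fraction of its maximum (a crude stand-in
    # for local-maximum detection along the gradient direction).
    return modulus > 0.3 * modulus.max()

def estimate_vertical_stroke_width(edge_map):
    """Median gap between consecutive edge points along image rows, a rough
    estimate of the width of vertical strokes."""
    gaps = []
    for row in edge_map:
        idx = np.flatnonzero(row)
        if idx.size >= 2:
            gaps.extend(np.diff(idx))
    return float(np.median(gaps)) if gaps else 0.0
```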
The extraction of the text trunk from the image is, in essence, an erosion operation on the language-text image with structuring elements. In the actual experimental processing, an appropriate structuring element can be selected according to the width of the character trunk and the width of the noise lines in the image. In the language-text image, 0 denotes the white background and 1 denotes the black pixels corresponding to the text and the noise points. Based on the analysis of the image noise of the study sample, a 3×3 square structuring element is selected for the erosion operation.
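A minimal sketch of the trunk extraction and trunk expansion steps is shown below using SciPy's binary morphology, assuming the 3×3 square structuring element mentioned above; function names and the iteration-based expansion heuristic are illustrative.

```python
import numpy as np
from scipy import ndimage

def extract_trunk(binary_glyph, iterations=1):
    """Trunk extraction: erode the binary glyph (1 = ink, 0 = background) with a
    3x3 square structuring element so thin noise lines and fine details drop out."""
    selem = np.ones((3, 3), dtype=bool)
    return ndimage.binary_erosion(binary_glyph.astype(bool), structure=selem,
                                  iterations=iterations)

def expand_trunk(trunk, binary_glyph, stroke_width):
    """Trunk expansion: grow the trunk back out, but only over pixels of the
    original glyph, up to roughly the stroke width determined earlier."""
    grown = trunk.copy()
    for _ in range(int(stroke_width)):
        grown = ndimage.binary_dilation(grown) & binary_glyph.astype(bool)
    return grown

# outline = expand_trunk(extract_trunk(glyph), glyph, stroke_width=4)  # illustrative
```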
In this paper, the four binary language-text images of the same object are fused. In the specific processing, the four images are first divided into two groups of two: the images lit obliquely from above and below form one group, and the images lit obliquely from the left and right form the other. Image fusion is carried out within each group, and the two resulting binary images are then fused again to obtain a language-text image with a better overall effect. Because the fused images are taken of the same object from different angles, the control-point correction method is adopted during fusion: geometric correction eliminates the differences in scale, translation and rotation between the language-text images, and the binary images are then combined by a pixel-level OR operation to obtain the synthesized binary language-text image.
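A minimal sketch of this alignment-and-OR fusion is given below, assuming OpenCV and that the affine transforms have already been obtained from the control-point geometric correction; the function name and interpolation choice are illustrative.

```python
import cv2
import numpy as np

def fuse_binary_outlines(images, affines):
    """Fuse binary outline images of the same glyph taken under oblique lighting
    from several directions: align each image with its (precomputed) control-point
    affine transform, then combine them with a pixel-level OR."""
    h, w = images[0].shape
    fused = np.zeros((h, w), dtype=bool)
    for img, m in zip(images, affines):
        aligned = cv2.warpAffine(img.astype(np.uint8), m, (w, h),
                                 flags=cv2.INTER_NEAREST)
        fused |= aligned.astype(bool)
    return fused.astype(np.uint8) * 255

# Each 2x3 affine matrix would come from control-point geometric correction,
# e.g. m = cv2.getAffineTransform(src_points, dst_points) for three point pairs.
```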
VRML, as a modeling language, is mainly used for modeling three-dimensional virtual scenes and is closely related to multimedia communication, virtual reality and other technical fields. Combining it with multimedia technology serves the ultimate goal of virtual reality, whose greatest feature is that it places people in a virtual yet realistic three-dimensional world with which they can interact naturally and in which the point of view can be changed at will. Bringing interaction into the scene lets visitors engage with it and gain an immersive, intuitive feeling; as an exchange file format and description standard, VRML provides a viable solution for realizing such a virtual reality environment.
First, the various 3D objects are built in 3D Studio MAX and exported in .wrl format into a VRML world, where their positions and relationships are edited to form a unified whole. Since the rectangular language-text tablets are regular geometry, their geometric models are generated by writing VRML code directly, and texture maps are then added through the VRML texture node.
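As a minimal sketch of this direct generation step, the Python script below writes a VRML97 node for a rectangular glyph tablet with a texture map to a .wrl file that could be merged with the exported scene; the file names, position and size are illustrative assumptions, not the paper's actual assets.

```python
# Generate a minimal VRML97 node for a rectangular glyph tablet with a texture.
VRML_TEMPLATE = """#VRML V2.0 utf8
Transform {{
  translation {x} {y} {z}
  children [
    Shape {{
      appearance Appearance {{
        texture ImageTexture {{ url "{texture}" }}
      }}
      geometry Box {{ size {w} {h} {d} }}
    }}
  ]
}}
"""

def write_glyph_tablet(path, texture, position=(0, 1.2, -3), size=(1.6, 0.9, 0.05)):
    x, y, z = position
    w, h, d = size
    with open(path, "w", encoding="utf-8") as f:
        f.write(VRML_TEMPLATE.format(x=x, y=y, z=z, texture=texture, w=w, h=h, d=d))

write_glyph_tablet("glyph_tablet.wrl", "glyph_outline.png")  # hypothetical file names
```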
For any input text image, the algorithm extracts its feature points and compares them against the language-text database by traversal; if the number of matching feature points is greater than the preset membership threshold, the input text image is judged to match the corresponding text in the standard library. The setting of this threshold therefore determines the accuracy of the recognition judgment: if the threshold is too large the miss rate is high, and if it is too small the false-judgment rate is high.
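The paper does not name the feature detector it uses; the sketch below uses ORB as a stand-in to illustrate the threshold-based membership decision, with the threshold of 11 taken from the experiments reported later. The function name and parameters are illustrative.

```python
import cv2

def matches_reference(query_path, reference_path, threshold=11):
    """Decide whether a query glyph image matches a reference library image:
    extract feature points, match them, and compare the number of matches
    against the membership threshold (11 in the paper's experiments)."""
    orb = cv2.ORB_create(nfeatures=500)
    query = cv2.imread(query_path, cv2.IMREAD_GRAYSCALE)
    ref = cv2.imread(reference_path, cv2.IMREAD_GRAYSCALE)
    _, des_q = orb.detectAndCompute(query, None)
    _, des_r = orb.detectAndCompute(ref, None)
    if des_q is None or des_r is None:
        return False
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des_q, des_r)
    return len(matches) > threshold

# Traversing the standard library with this test: too low a threshold admits
# false matches against other glyphs, too high a threshold misses true matches.
```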
Through comparison and analysis of repeated experimental data, the membership-judgment threshold for feature-point recognition of text images is set reasonably. First, a sample set of 355 characters, comprising 3267 text variant images in total, is randomly selected from the standard text database for training of feature point extraction, and one image is taken from each character's sample set as a test sample, as shown in Figure 3. The average number of feature points per language-text image in the figure is 82.
The language-text images in the test sample set are removed from the standard library, and the images in the sample set are compared against their variant images for feature-point matching. Assuming that none of the language texts in the sample set belongs to the reduced standard library, traversal feature-point matching is carried out between the sample set and the standard library with those data removed; the matching results are shown in Figure 4. As can be seen from the figure, the highest number of mutually matched feature points between different language texts is 11.
Based on the above analysis, matching thresholds of 8, 10 and 11 feature points are tested; the results are shown in Table 1. The table shows that, in the matching of feature points between language-text images, the number of feature points matched by images that do not correspond to the same original is never greater than 11. With the threshold set above 11, the original image is matched accurately: the miss rate for the original image is almost 0 and the miss rate for variants is 99.5%, which ensures that the algorithm matches only the correct pair of original and variant images and distinguishes variants of the same character. The threshold of 11 therefore gives the highest accuracy with a low miss rate and is taken as the optimal value; with this threshold set, no misrecognition occurs.
| Threshold selection | Original image miss rate | False selection rate | Variant false-detection rate | Variant miss rate |
|---|---|---|---|---|
| 8 | 0% | 25% | 93.4% | 81% |
| 10 | 0.4% | 4% | 31% | 98.2% |
| 11 | 0.6% | 0% | 0% | 99.5% |
The proportions of interest in the exhibits among 500 visitors to the virtual scene are shown in Figure 5. Users are most interested in the text images themselves; the order of interest from high to low is: text image, visual imagery, pavilion experience, scene art, and language culture.
This section conducts a quantitative study of user experience in the virtual pavilion of Chinese language and writing constructed in Chapter III. The questionnaire uses a 100-point scale divided into five scoring intervals, each with a corresponding rating; the intervals are shown in Table 2.
First, the participants entered the "Chinese Language and Writing Virtual Pavilion" on a PC to browse freely, and after finishing rated each question on the questionnaire. The arithmetic average and the weighted average of each evaluation index were calculated as the user experience evaluation results, shown in Table 3. As the table shows, the arithmetic mean scores of the five dimensions, from high to low, are: cultural experience (95.34), emotional experience (86.23), interactive experience (83.52), immersive experience (83.21) and aesthetic experience (78.56).
| Very bad | Bad | General | Good | Great |
|---|---|---|---|---|
| 50-60 | 60-70 | 70-80 | 80-90 | 90-100 |
| Level | Evaluation dimension | Evaluation index | Index average | Index weight | Index weighted average score | Dimension arithmetic average score | Dimension weighted average score |
|---|---|---|---|---|---|---|---|
| Sensory level | Aesthetic experience | Color collocation | 85 | 0.059 | 6.81 | 78.56 | 25.11 |
| | | Font design | 83.53 | 0.0667 | 6.51 | | |
| | | Icon symbol | 74.2 | 0.0632 | 6.48 | | |
| | | Page layout | 65 | 0.0562 | 6.17 | | |
| | | Relic pattern | 86.37 | 0.0481 | 6.82 | | |
| Interaction level | Immersive experience | Ease of use | 79.37 | 0.0653 | 6.94 | 83.21 | 16.77 |
| | | Usefulness | 91.87 | 0.0504 | 6.38 | | |
| | | Autonomy | 77.87 | 0.0291 | 7.87 | | |
| | Interactive experience | System flow | 89.03 | 0.0571 | 6.83 | 83.52 | 14.41 |
| | | Interactivity | 77.53 | 0.0573 | 6.24 | | |
| | | Timely feedback | 83 | 0.0543 | 6.33 | | |
| Reflective level | Emotional experience | Novelty | 90.2 | 0.0605 | 8.06 | 86.23 | 23.56 |
| | | Affinity | 83.03 | 0.0674 | 7.33 | | |
| | | Pleasure | 87.2 | 0.0536 | 6.44 | | |
| | | Satisfaction | 87.2 | 0.0482 | 6.83 | | |
| | Cultural experience | Scene sense | 93.03 | 0.066 | 7.84 | 95.34 | 15.76 |
| | | Identity | 96 | 0.0485 | 6.43 | | |
| | | Sense of value | 96.53 | 0.0491 | 6.48 | | |
Combined with the weight values of each dimension in Table 3, the weighted average scores, from high to low, are: aesthetic experience (25.11), emotional experience (23.56), immersive experience (16.77), cultural experience (15.76) and interactive experience (14.41).
The model also gives the weight of each evaluation index within each dimension. Taking the sensory level as an example, the weights of the five second-level evaluation indicators in this dimension are 0.18, 0.25, 0.14, 0.23 and 0.2, respectively, which suggests that the "color matching" and "cultural relics pattern" indicators are slightly less important within this dimension than "font design", "page layout" and "icon symbol", and that these two indicators are not rigid requirements for improving the aesthetic experience. The weights of the indicators in each dimension are shown in Figure 7, Figure 8 and Figure 9.
The model also gives the importance weight of each evaluation indicator within the whole model, as shown in Figure 10, with the indicators plotted on the horizontal axis. In the figure, the weight of "color matching" is 0.0799, the highest total weight among the 18 evaluation indicators, indicating that it has a large impact on improving the overall user experience.
The User Interaction Satisfaction Scale was adopted, appropriately modified and partially trimmed; the modified scale is divided into 5 parts with a total of 20 questions, each scored from 1 to 7 with a midpoint of 4. The ratings of each part of the questionnaire were calculated for the 8 subject users, and the averages per part are shown in Figure 11. The subjects' ratings for every part were higher than 5, clearly above the midpoint of 4. The highest rating was given to system capability, indicating that users were satisfied with the cultural dissemination, the use of virtual technology, the pattern design, the colors and the other aspects of the virtual pavilion of Chinese language and writing.
In order to inherit and protect the intangible cultural heritage of Chinese language and writing, this study proposes the concept of a virtual exhibition hall of Chinese language and writing and, based on user analysis, designs its visiting route, design structure and development process. In the implementation stage, the image-text extraction algorithm and 3D design software are used to complete the design of the virtual exhibition, and the Unity 3D engine is used to refine the interaction module and integrate the content of the exhibition space.
The analysis of the image-text feature extraction algorithm shows that, in matching the feature points of language-text images, the algorithm matches the original image accurately: the miss rate for the original image is almost 0, the miss rate for variants is 99.5%, and no misrecognition occurs.
According to the user experience feedback, the average total experience score of the eight subjects exceeded 60 points, so the experience is effective. This shows that users are satisfied with the effect of the virtual exhibition hall in spreading Chinese characters and culture, and that the hall also contributes to the inheritance and protection of the intangible cultural heritage of Chinese language and writing.