How Neural Networks See Texture: A Heartfelt Guide For Lovers Of Handmade Materials
When you run your fingers along a hand-carved wooden box, or tug gently on the fringe of a woven scarf, you are really reading texture with your fingertips. As an artful gifter, you probably think in fabrics, grains, glazes, and finishes long before you think in pixels. Yet the moment you photograph your creation, upload it to your shop, or explore AI tools to visualize new designs, a neural network becomes part of that story too.
Neural networks do not feel wool vs. linen the way we do. They “feel” through light and shadow, tiny repetitions, and surprisingly poetic statistics. Understanding how they perceive material textures is not just a technical curiosity; it can quietly transform how you photograph, design, and even authenticate your handmade and personalized pieces.
In this guide, I will weave together experience from the studio with insights from computer vision research in journals such as Multimedia Tools and Applications, PeerJ Computer Science, Sensors, Scientific Reports, and Metals. The goal is simple: to give you an intuitive, practical sense of how these systems see sand vs. silk, grain vs. glitter, and how you can collaborate with them without losing the soul of your craft.
Why Texture Matters In Artful, Sentimental Objects
Texture is the emotional temperature of an object. It helps a mug feel cozy, a notebook feel luxurious, and a ring box feel heirloom-worthy. In everyday language we talk about smooth, rough, velvety, crackled, knitted, or hammered surfaces. Texture is often the first thing your customer imagines when they read your product description, long before the package arrives at their door.
Researchers in computer vision describe texture in a surprisingly similar way. A research article hosted on Scribe describes texture as repeating patterns of local variations in image intensities, characterized by coarseness, contrast, directionality, regularity, and roughness. They also distinguish between tactile texture (how something feels) and visual texture (how it looks). Neural networks only ever see the visual side, but that visual side is usually the cue our brains use to guess the feel.
A chapter in an MIT vision textbook on texture points out that when we look at a wall of bricks or a field of pebbles, we rarely count individual pieces. Instead, our brains quickly summarize the region with statistics like “very regular,” “speckled,” or “stringy.” This statistical view is key. Neural networks also lean heavily on statistics, not on recognizing individual fibers or grains one by one.
When you sell a handwoven blanket or a ceramic keepsake, your photos are asking both humans and algorithms to make a decision: “Do I trust this? Does it match the search query for rustic wood, soft cotton, or matte stone?” Understanding how texture becomes data helps you present your work faithfully and keeps those tiny, sentimental details from getting lost in translation.

From Wood Grain To Pixel Patterns: What “Texture” Means To A Neural Network
In handcrafted work, texture feels continuous. In a computer, texture becomes a grid of tiny brightness values. A neural network sees a woven runner as a pattern of light and dark points, not as “linen with love woven in.” What matters most to the network are the local relationships between those points.
Researchers often talk about texels or textons, conceptual building blocks of texture such as tiny junctions, corners, or repeating motifs. A PeerJ Computer Science paper on regular texture recognition describes textures as being made from atomic units called textons, arranged either in regular grids (bricks, tiles, fences) or more irregular layouts (grass, foam, weathered stone). Regular textures have clearly repeating structure; irregular textures feel more random.
For neural networks, these atomic units are not physical threads or grains, but consistent visual micro-patterns. The network studies how often they appear, how they are arranged, and how they change across the surface. That is why a subtle linen weave, a heavy burlap, and a polished marble can look quite distinct to a model, even when they share similar colors.
From your perspective as a maker, that means every choice in your photography that changes the tiny contrast patterns—lighting, focus, camera distance, even the background fabric—will change how a neural network “reads” your material. Before we get to practical advice, it helps to see how different families of models look at texture.

Classical Texture Descriptors: Early Ways Of Teaching Machines To Feel Surfaces
Long before deep neural networks, computer vision researchers developed handcrafted descriptors, little recipes for turning pixel neighborhoods into texture fingerprints. Several of these still matter because they are either used inside modern systems or perform surprisingly well on certain kinds of textures.
One foundation is the Local Binary Pattern (LBP), described in tutorials such as the texture-classification exercise credited to DJ Lee at Brigham Young University. LBP compares each pixel to its neighbors. If the neighbor is brighter than the center, it writes a one; otherwise it writes a zero. Those bits around the pixel form a code that captures a tiny edge or bump pattern. In that exercise, the authors use a rotation-invariant LBP that reduces all possible local patterns to 36 canonical codes, then feed their histograms into a Support Vector Machine (SVM) to distinguish textures like sand, seeds, and stone. They even recommend adding a new material class such as wood or grass, retraining, and analyzing the confusion matrix to see which textures the model confuses.
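If you like to tinker, here is a minimal sketch of that recipe, assuming scikit-image and scikit-learn; the variables `train_patches`, `train_labels`, `test_patches`, and `test_labels` are placeholders for your own labeled grayscale crops, not names from the BYU exercise.

```python
# Minimal sketch: rotation-invariant LBP histograms fed to an SVM.
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.svm import SVC

P, R = 8, 1  # 8 neighbors at radius 1; method="ror" yields 36 rotation-invariant codes

def lbp_histogram(patch):
    codes = local_binary_pattern(patch, P, R, method="ror")
    # Normalized histogram over all possible 8-bit codes (only 36 bins are ever used)
    hist, _ = np.histogram(codes, bins=np.arange(2 ** P + 1), density=True)
    return hist

# train_patches / test_patches: lists of 2D grayscale arrays (placeholder data)
X_train = np.array([lbp_histogram(p) for p in train_patches])
X_test = np.array([lbp_histogram(p) for p in test_patches])

clf = SVC(kernel="rbf").fit(X_train, train_labels)
print("held-out accuracy:", clf.score(X_test, test_labels))
```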
Another classical family uses grey level co-occurrence matrices, or GLCMs. A GLCM counts how often pixel pairs with gray levels i and j occur at a certain distance and direction. From this, Haralick features such as contrast, entropy, and homogeneity summarize the texture’s overall feel. A Scientific Reports case study on cereal identification uses twenty different GLCM-based features as inputs to a multilayer perceptron to distinguish wheat, barley, and rapeseed kernels flying through pneumatic seeding tubes. Even as the seeds rush by in a transparent tube, the subtle textures on their surfaces carry enough information for a neural network to tell them apart.
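A minimal sketch of GLCM-based features, assuming scikit-image 0.19 or later (where the functions are spelled `graycomatrix` and `graycoprops`); the exact feature set below is illustrative rather than the study's twenty descriptors.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops  # scikit-image >= 0.19 spelling

def glcm_features(gray_uint8):
    """Haralick-style statistics from co-occurrence matrices at a few offsets."""
    glcm = graycomatrix(gray_uint8,
                        distances=[1, 2],
                        angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                        levels=256, symmetric=True, normed=True)
    feats = []
    for prop in ("contrast", "homogeneity", "energy", "correlation"):
        feats.extend(graycoprops(glcm, prop).ravel())
    # Entropy is not built into graycoprops, so compute it from the matrices directly
    feats.extend((-np.sum(glcm * np.log2(glcm + 1e-12), axis=(0, 1))).ravel())
    return np.array(feats)

# Feature vectors like these could feed sklearn.neural_network.MLPClassifier,
# echoing the multilayer perceptron used in the seed study.
```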
Gabor filters are another classic tool. A study on segmentation of textures on flat vs. layered surfaces, archived in PubMed Central, modeled early visual cortex by convolving texture images with a bank of oriented Gabor filters at several spatial frequencies. From these responses they computed three matrices: energy (edge strength), dominant orientation, and dominant spatial frequency. These became the inputs to simple multilayer perceptrons trained to detect whether there was a texture boundary in the image. One network saw only static 2D arrangements; another saw sequences simulating 3D motion and occlusion, echoing how our own vision uses self-motion to detect layers of material.
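To make the idea concrete, here is a small sketch, assuming scikit-image, that runs a modest Gabor bank over a grayscale image and summarizes energy, dominant orientation, and dominant spatial frequency; the filter counts and frequencies are illustrative choices, not the study's.

```python
import numpy as np
from skimage.filters import gabor

def gabor_summary(gray, frequencies=(0.1, 0.2, 0.3), n_orient=6):
    thetas = [i * np.pi / n_orient for i in range(n_orient)]
    energy = np.zeros((len(frequencies), len(thetas)))
    for fi, freq in enumerate(frequencies):
        for ti, theta in enumerate(thetas):
            real, imag = gabor(gray, frequency=freq, theta=theta)
            energy[fi, ti] = np.mean(real ** 2 + imag ** 2)  # response energy per band
    fi, ti = np.unravel_index(np.argmax(energy), energy.shape)
    return {
        "total_energy": float(energy.sum()),
        "dominant_frequency": frequencies[fi],
        "dominant_orientation_deg": float(np.degrees(thetas[ti])),
    }
```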
Even texture synthesis methods like the Heeger–Bergen algorithm, described in the MIT vision text, build on similar ideas. That method decomposes an image into multiple orientation and scale bands using a steerable pyramid, then matches the histograms of filter responses to generate new samples of the same texture. Starting from noise, iteratively matching subband statistics produces surprisingly convincing wood grain, gravel, or woven patterns.
For artisans, these classical descriptors mirror what you do by eye. You notice edge sharpness, direction of fibers, regularity of repeats, and overall “busy-ness.” LBP and GLCM capture local patterns and global statistics; Gabor filters capture oriented ridges and waves. They are fast and compact, but they need a human to decide which filters or statistics to use. Deep learning automates this feature design.
To ground the comparison, here is a compact table connecting classical descriptors to creative practice.
| Method | What it really “sees” | Pros for material textures | Limitations for makers’ images |
| --- | --- | --- | --- |
| LBP | Tiny binary edge patterns around each pixel | Simple, fast, robust to lighting shifts; good for fine grains | Can miss large-scale structure; sensitive to noise |
| GLCM / Haralick | Joint statistics of pixel pairs in set directions | Captures overall roughness or smoothness; used in seed sorting | Needs careful parameter choice (distance, direction) |
| Gabor filters | Oriented ripples at several spatial scales | Resembles how V1 cells respond; good for stripes, weaves | Requires manual filter design and post-processing |
| Heeger–Bergen | Histograms of multi-scale filter responses | Can synthesize convincing textures for graphics or mockups | Struggles with long-range or very structured patterns |
Deep Neural Networks: When Textures Become Stories
Deep convolutional neural networks (CNNs) changed texture analysis by learning features directly from pixels. Instead of explicitly computing LBP or GLCM, a CNN learns filters that serve a similar role but adapt themselves to the training data.
A study in PeerJ Computer Science focusing on regular texture recognition describes how classic CNN architectures such as Inception and ResNet can classify textures as regular or irregular with about ninety-eight percent accuracy on a curated dataset of 2,460 images. The authors built a balanced database that distinguishes strongly periodic textures like tiles and fences from irregular textures like rough stone or foliage. By fine-tuning only the output layer, they showed that networks originally designed for object recognition transfer well to texture regularity.
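Here is a minimal sketch of that fine-tuning recipe, assuming PyTorch and torchvision 0.13 or later; the two-class head mirrors the regular-vs-irregular task, while the optimizer and learning rate are illustrative placeholders rather than the paper's settings.

```python
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
for param in model.parameters():
    param.requires_grad = False                    # freeze the pretrained backbone
model.fc = nn.Linear(model.fc.in_features, 2)      # new head: regular vs. irregular

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

def train_step(images, labels):                    # one batch of texture crops
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```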
Interestingly, they also explored Fisher Vector pooling of CNN filter responses, which aggregates local activations into high-dimensional descriptors. These pooled representations, paired with standard classifiers such as SVMs, carry enough information to perform well on other tasks like segmentation and image retrieval. In practice, that means a network trained to tell “tidy grid” from “messy scatter” can also help you separate neat woven patterns from painterly marbled glazes in your product catalog, as long as the underlying statistics are similar.
Another perspective comes from neuroscience-inspired work summarized in a recent article on texture recognition that combines deep learning with regional features. That paper notes that responses to naturalistic textures grow stronger from early visual areas to higher-level areas such as V2 and V4 in the primate brain, and that CNNs show similar patterns: texture selectivity appears in intermediate layers. Deep CNNs like ResNet and GoogLeNet tend to outperform handcrafted descriptors on many non-stationary texture datasets, where statistics change across the image. However, the same paper observes that for stationary textures, where statistics remain constant, simple GLCM features plus a nearest-neighbor classifier can outperform CNNs.
To address this, the authors propose augmenting CNNs with additional regional texture channels, such as GLCM-based statistics, and introduce an “orthogonal convolution” that produces more orderless texture representations. With a relatively shallow seven-layer network combining these ideas, they report an average improvement of about eight and a half percentage points in accuracy on the Outex benchmark compared with strong deep baselines like GoogLeNet and ResNet. Their methodological recommendation is clear: match the architecture to the texture type and do not be afraid to blend handcrafted and learned features.
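One simple way to blend the two kinds of features, sketched below in PyTorch, is to concatenate GLCM-style statistics with globally pooled CNN activations before a small classifier. This is a plain fusion baseline for intuition only, not the paper's orthogonal convolution or its seven-layer network.

```python
import torch
import torch.nn as nn
from torchvision import models

class HybridTextureNet(nn.Module):
    def __init__(self, n_glcm_feats=20, n_classes=10):
        super().__init__()
        backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
        self.cnn = nn.Sequential(*list(backbone.children())[:-1])  # drop the final fc layer
        self.classifier = nn.Linear(512 + n_glcm_feats, n_classes)

    def forward(self, images, glcm_feats):
        deep = self.cnn(images).flatten(1)          # (B, 512) pooled learned features
        return self.classifier(torch.cat([deep, glcm_feats], dim=1))

logits = HybridTextureNet()(torch.randn(2, 3, 224, 224), torch.randn(2, 20))
```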
For makers, this has a direct echo. Smooth backgrounds with subtle paper grain behave like stationary textures; busy patterned fabrics behind your jewelry behave like non-stationary textures. A model tuned for one can stumble on the other. When you see an AI misclassify your soft, evenly lit ceramic as “plastic,” it may not be that it hates your glaze; it may simply be trained for more irregular, contrasty textures and missing the quiet, stationary subtleties.
Transformers, Ultrasonic Echoes, And The Inner Texture Of Metal
Texture is not only skin-deep. In materials science, microstructural texture—the arrangement and size of grains inside metal—directly affects strength, durability, and resistance to cracking. Researchers in ultrasonic nondestructive evaluation have turned to neural networks, including transformers, to read these hidden textures.
Two IEEE studies on ultrasonic texture recognition focus on estimating grain size from backscattered ultrasonic signals. Traditional approaches try to measure attenuation along the wave’s path, but because grain size changes with depth, direct measurement is tricky. Instead, these works analyze the statistical variations of backscattered energy as a function of depth in the Rayleigh scattering regime, where the ultrasonic wavelength is longer than the average grain size and the backscattered energy is highly sensitive to frequency and to the grain-size distribution.
Deep CNNs have already been used to estimate grain size from these ultrasonic images, but they can involve hundreds of millions of parameters and demand heavy computation. The first study proposes an Ultrasonic Texture Recognition Vision Transformer, or UTRV Transformer, adapted from vision transformers, which in turn grew out of the transformer architectures first developed for language modeling. By relying on attention rather than convolutions, the transformer reduces training time and computational load while still recognizing grain-size patterns in ultrasonic C-scan images of steel blocks with different heat treatments.
Building on that, a second study introduces a data-efficient ultrasonic transformer, or DEUTR transformer, further optimized for fast training and deployment under computational constraints. The authors emphasize that transformer-based networks can outperform carefully designed deep CNNs in classification accuracy while remaining more efficient, which is crucial for real-time inspection systems.
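For intuition about what “attention instead of convolutions” looks like, here is a minimal ViT-style classifier for single-channel scan images, assuming PyTorch. The patch size, depth, and three-class head are illustrative; the UTRV and DEUTR architectures are considerably more refined.

```python
import torch
import torch.nn as nn

class TinyPatchTransformer(nn.Module):
    def __init__(self, img=64, patch=8, dim=64, depth=4, heads=4, classes=3):
        super().__init__()
        n_patches = (img // patch) ** 2
        self.embed = nn.Conv2d(1, dim, kernel_size=patch, stride=patch)  # patchify + project
        self.cls = nn.Parameter(torch.zeros(1, 1, dim))                  # classification token
        self.pos = nn.Parameter(torch.zeros(1, n_patches + 1, dim))      # learned positions
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Linear(dim, classes)

    def forward(self, x):                                    # x: (B, 1, img, img)
        tokens = self.embed(x).flatten(2).transpose(1, 2)    # (B, n_patches, dim)
        cls = self.cls.expand(x.shape[0], -1, -1)
        tokens = torch.cat([cls, tokens], dim=1) + self.pos
        return self.head(self.encoder(tokens)[:, 0])         # classify from the CLS token

logits = TinyPatchTransformer()(torch.randn(2, 1, 64, 64))   # e.g. three grain-size classes
```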
Although most handmade gift studios do not run ultrasonic scanners, the principle is inspiring. Neural networks can learn to interpret textures that are far beyond human senses: internal grain, microcracks, subtle scattering patterns. When you choose a particular alloy for jewelry or a specific high-fired clay for ceramics, similar microstructural textures determine how the piece will age. These studies hint at future collaborations where AI helps you validate material integrity while you focus on the surface feel and story.

Texture Fields And Continuous 3D Appearances
Texture is also a 3D phenomenon. A carved wooden ornament has grain that wraps around edges; a hand-painted mug has brush strokes that curve with its shape. Conventional digital texture maps flatten this onto two-dimensional images wrapped onto 3D models, often introducing seams and resolution limits.
A paper from an ICCV conference introduces texture fields as a different approach. Instead of storing colors in pixels, a texture field is a continuous function implemented by a neural network that maps 3D coordinates (and optionally viewing direction) plus a latent code to RGB color values. An image encoder extracts a latent representation from one or more posed RGB views, and a texture-field decoder predicts color at arbitrary points on or around a shape, often paired with a separate implicit field for geometry such as an occupancy network.
Training uses differentiable rendering: the system samples points along camera rays, queries both geometry and texture fields, renders images, and compares them to ground truth using photometric losses. The key benefit is that memory use does not grow with resolution: voxel textures require memory proportional to the cube of the resolution, while a texture field uses a fixed-size network but can be queried at arbitrarily high resolution. This yields sharper details and consistent colors across views, avoiding seams common with UV texture maps.
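A minimal sketch of the core idea, assuming PyTorch: a small MLP maps a 3D point plus a latent code to an RGB color, so the surface can be queried at any resolution without storing a pixel grid. The layer sizes and the simple concatenation conditioning are illustrative, not the published architecture.

```python
import torch
import torch.nn as nn

class TextureField(nn.Module):
    def __init__(self, latent_dim=128, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3 + latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Sigmoid(),   # RGB in [0, 1]
        )

    def forward(self, points, latent):
        # points: (N, 3) surface samples; latent: (latent_dim,) code from an image encoder
        z = latent.expand(points.shape[0], -1)
        return self.net(torch.cat([points, z], dim=-1))

# Query colors at arbitrary sample locations, no pixel grid required
field = TextureField()
colors = field(torch.rand(1024, 3), torch.randn(128))
```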
For handcrafted products, texture fields hint at a future in which you could scan a single view of your hand-glazed vase and let an implicit neural representation imagine the full 3D texture in all directions. That could power customizable previews where customers rotate a personalized item online, seeing the same swirl of glaze or grain they would see in person.

Neural Networks Meeting Real Materials: Metals, Seeds, And Directionality
Several recent studies show how neural networks respond to real-world textures in contexts surprisingly close to craft and materials.
In the journal Metals, researchers compare traditional texture descriptors such as GLCM, local binary patterns, and textons with CNN-based transfer learning for material-related textures. In one case study with simulated grain textures generated by Voronoi diagrams, traditional features achieved about fifty-eight to sixty-five percent accuracy at distinguishing slightly different grain-size classes. Direct deep features from pretrained CNNs did not always outperform these baselines. However, when the networks were retrained on the domain-specific textures, performance improved: partially retrained networks reached roughly sixty-three percent accuracy or better, fully retrained networks roughly sixty-six percent or better, and the best (a fully retrained GoogLeNet) achieved about seventy-four percent. In another case study with ultrahigh carbon steel micrographs, partially retrained networks achieved around eighty-seven to ninety-one percent accuracy, while fully retrained models reached about ninety-seven percent. The authors conclude that for material textures, transfer learning works best when at least some layers are retrained and when data augmentation is used.
The Scientific Reports seed study, set in an agricultural context, is almost poetic in how it treats tiny grains as textured objects in motion. The authors recorded high-speed videos of wheat, barley, and rapeseed kernels traveling through pneumatic tubes, then converted frames into grayscale images. They extracted twenty Haralick texture descriptors from GLCMs and fed them into multilayer perceptrons with twenty input neurons, fifteen hidden neurons, and three outputs for the grain classes. Different air velocities and tube configurations were used to test robustness. The work shows that even in challenging, dynamic conditions, texture statistics alone can support reliable classification.
Texture direction is another important property, especially for materials like wood, brushed metal, twill weaves, or corrugated cardboard. A Sensors article presents a convolutional neural network approach to texture directionality detection. The authors generated synthetic bar-pattern images with various bar thicknesses and periods, rotated them through 180 angles, added different levels of Gaussian noise and blur, and cut them into over seven hundred thousand tiles. They designed twelve CNN architectures, both shallow and deep, and tested seven activation functions. Directionality was framed as a 180-class classification problem, one class per degree, and predictions whose maximum probability fell below a threshold of about 0.011 were treated as having “no meaningful direction.” Their results show that asymmetrical, unbounded activations like ELU and SELU work best in shallow networks, while symmetrical, bounded activations like tanh and softsign favor deep networks. Shallow networks are more robust to noise and mild blur, whereas deep networks slightly outperform at the highest blur levels.
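The rejection rule is easy to state in code. The sketch below assumes PyTorch and any 180-class directionality classifier passed in as `model` (a placeholder name); it keeps the winning angle only when its softmax probability clears the roughly 0.011 threshold.

```python
import torch
import torch.nn.functional as F

THRESHOLD = 0.011  # roughly double the 1/180 chance level

def predict_direction(model, tile):
    """Return the dominant direction in degrees (0-179), or None if there is no clear one."""
    probs = F.softmax(model(tile.unsqueeze(0)), dim=-1).squeeze(0)  # 180 class probabilities
    best = int(probs.argmax())
    if probs[best].item() < THRESHOLD:
        return None            # nearly uniform tile: no meaningful direction
    return best
```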
If you create items where the direction of grain or brush strokes matters—say, personalized cutting boards or engraved metal bracelets—these findings are highly relevant. A neural network can be trained to check whether the engraving aligns with the grain, or to flag photos where motion blur or heavy noise have erased the material’s visual direction.
What This Means When You Photograph Or Digitize Handmade Textures
Translating all this into your day-to-day creative life, the key message is that neural networks are extremely sensitive to how texture statistics appear in your images. They notice contrast, repetition, and orientation far more than brand names or price tags.
Because many approaches, from Gabor filters to CNNs, rely on multi-scale, multi-orientation filters, the direction and sharpness of edges on your product photos matter a great deal. Strong directional light that creates harsh highlights on one side and deep shadows on the other can shift the apparent texture from “soft knit” to “high contrast ridges.” Gentle, even lighting tends to preserve the underlying material statistics without introducing artificial patterns.
Patch size also matters. Research on texture analysis emphasizes the choice of patch size and resolution as a practical consideration. If an image patch is too small, the network may see only a few stitches or grains and fail to recognize the larger pattern. If it is too large and includes multiple textures—your mug plus the wooden table plus a patterned cloth—it becomes harder to assign a single label, and the background might dominate. Cropping so that each patch contains mostly one texture, as in the Sensors directionality paper where images were sliced into sixty-four by sixty-four pixel tiles, makes classification easier.
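As a small illustration of that cropping habit, the sketch below slices a grayscale photo into non-overlapping 64-by-64 tiles, the same tile size used in the directionality paper; the function name and defaults are mine, not theirs.

```python
import numpy as np

def tile_image(gray, tile=64):
    """Cut a grayscale image into non-overlapping tile x tile patches."""
    h, w = gray.shape[:2]
    tiles = [gray[y:y + tile, x:x + tile]
             for y in range(0, h - tile + 1, tile)
             for x in range(0, w - tile + 1, tile)]
    return np.stack(tiles) if tiles else np.empty((0, tile, tile), dtype=gray.dtype)
```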
Backgrounds deserve special care. From the perspective of a CNN trained to detect regular vs. irregular textures, a highly patterned backdrop behind your jewelry is not “just background.” It is a competing regular or stochastic texture. The model may latch onto the repeating chevrons of a blanket rather than the subtle brushed finish of your pendant. Whenever your goal is to showcase the material of the object, treat the background as a supporting actor with a simpler, quieter texture.
Multi-angle views are also powerful. The 2D vs. 3D segmentation study in PubMed Central showed that networks seeing motion-based occlusion from different viewpoints could learn richer cues about layered textures than those seeing a single static view. For your shop, multiple photos from slightly different angles do something similar for humans and algorithms alike: they reveal how the texture responds to light and self-occlusion, whether it is a glossy glaze, a matte fabric, or a faceted gemstone.
To bring this together, consider the following table as a practical bridge between creative choices and the neural view.
| Studio choice | What the network actually sees | Practical implication for photos and scans |
| --- | --- | --- |
| Soft, even lighting on a knitted scarf | Stable local intensity differences and gentle gradients | Helps models capture “soft, stationary” texture without artifacts |
| Busy patterned blanket behind a ceramic mug | Mixture of regular and irregular textures competing in one patch | Risk that background dominates; consider simpler backdrops |
| Close-up macro shot of brush strokes on a bowl | High-resolution, multi-scale edges and orientations | Great for models that rely on early-layer filters and LBP-like cues |
| Heavily compressed or blurred product photo | Smoothed-out edges and lost fine structure | Deep models may struggle; shallow CNNs and GLCM may fare better |
| Multiple angles around a carved ornament | Different configurations of the same underlying 3D texture statistics | Supports both human and AI recognition of consistent material |
Interpreting Texture Decisions: When To Trust The Model And When To Ask For A Human Eye
Even with a good understanding of how networks see texture, decisions are not perfect. The classical LBP plus SVM exercise from BYU explicitly recommends checking confusion matrices after adding new texture classes, watching which materials are often mistaken for each other. If “stone” and “concrete” are frequently swapped, it might be because their LBP histograms are too similar under your lighting and scale.
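Checking this takes only a few lines with scikit-learn; in the sketch below, the class list and the `y_true` and `y_pred` arrays are placeholders standing in for your own labels and a trained classifier's predictions.

```python
from sklearn.metrics import confusion_matrix

classes = ["sand", "seeds", "stone", "wood"]   # "wood" as the newly added class
cm = confusion_matrix(y_true, y_pred, labels=list(range(len(classes))))
print(cm)
# Large off-diagonal entries reveal swapped materials: for example, the count in
# row "stone", column "wood" is how many stone patches were mislabeled as wood.
```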
In the Sensors directionality work, the authors recognize that a model always produces some angle, even for nearly uniform or homogeneous images. They deal with this by discarding predictions when the maximum probability falls below about 0.011, roughly double the 1/180 average probability of their 180-class system. This idea generalizes: if a model seems unsure, treat its decision as “no clear texture,” not as a definitive mislabel.
Error metrics also need to respect texture symmetries. For directional patterns, a stripe at 10 degrees and one at 190 degrees are essentially the same orientation. The Sensors paper uses an error definition based on the arc cosine of the absolute value of the cosine of the angle difference, which correctly accounts for 180-degree periodicity. For creative workflows, it reminds us that some differences the model cares about are irrelevant to human perception, and vice versa.
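That error measure is compact enough to write down directly; this small sketch assumes NumPy and treats directions 180 degrees apart as identical.

```python
import numpy as np

def orientation_error_deg(pred_deg, true_deg):
    """Angular error that treats directions 180 degrees apart as the same orientation."""
    delta = np.radians(pred_deg - true_deg)
    return np.degrees(np.arccos(np.clip(np.abs(np.cos(delta)), 0.0, 1.0)))

print(orientation_error_deg(10, 190))   # 0.0  -- same orientation
print(orientation_error_deg(10, 100))   # 90.0 -- perpendicular stripes
```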
Trust grows when you know what a model is trained on. The Metals study shows that direct deep features from off-the-shelf CNNs may not outperform classical descriptors on specialized textures, but retrained models often do. If you intend to use AI to help sort wooden blanks by grain quality or verify that a batch of hand-dyed skeins matches a reference texture, make sure the model has seen examples from your materials, not just generic images.
Finally, interpretability tools such as activation map visualizations, discussed in general overviews of deep texture analysis, can reveal which regions and patterns a network attends to. If you see that the network bases its “handmade ceramic” decision mostly on the bokeh of fairy lights in the background, you know it is learning the wrong cue. Adjust your lighting or cropping, retrain, and try again.
FAQ: Texture-Aware AI For Makers And Gift Designers
Can a neural network really understand softness, warmth, or “handmade” character?
Not in the human sense. Networks operate on patterns of light and shadow. However, studies in PeerJ Computer Science, Metals, and Scientific Reports show that they can reliably discriminate between surfaces with different microstructures, grain sizes, or weave regularity based solely on texture statistics. Softness and “handmade-ness” often correlate with visual cues such as irregular brush strokes, gentle gradients, or certain frequency patterns. The model picks up those cues, but it does not feel the comfort of a quilt. That part still belongs to you and your customers.
If I use AI to generate mockups of my handmade products, will the textures look authentic?
It depends on the underlying texture representation. Systems inspired by the Heeger–Bergen algorithm or by texture fields can produce remarkably convincing textures, especially when trained on real examples of your materials. Yet limitations remain: classical histogram-based methods struggle with long-range structure, and even texture fields may hallucinate details in occluded regions. For sentimental gifts, the safest route is often to use AI as a sketching partner, then photograph the real piece so the true surface story comes through.
Is deep learning always better than classical texture features for my use case?
Research summarized across Metals, the Outex-focused work on orthogonal convolution, and the ultrasonic transformer papers suggests a more nuanced answer. Deep CNNs and transformers usually win when textures are complex, non-stationary, or embedded in noisy environments, and when you have enough labeled examples and can retrain at least part of the model. Classical features like LBP and GLCM can still outperform deep models on certain stationary textures or when data are very limited. For a small studio, a hybrid approach—simple descriptors for quick checks, plus a retrained deep model for high-stakes or subtle distinctions—often makes practical sense.
In the end, neural networks do not replace your eye or your heart; they extend them. They see what you already sense in your favorite materials, but at a different scale and with different strengths. When you understand how they read grain, weave, shine, and roughness, you can photograph and design your handmade and personalized pieces in ways that honor both the algorithm and the emotion. Think of the network as a meticulous apprentice who notices every stitch, every groove, every fleck in your glaze, while you hold the bigger story of why this gift will matter in someone’s life.
References
- https://visionbook.mit.edu/textures.html
- https://ecasp.ece.iit.edu/publications/2012-present/2020-13.pdf
- https://pmc.ncbi.nlm.nih.gov/articles/PMC11041941/
- https://authors.library.caltech.edu/records/wmrd8-0zx07/files/636-remote-sensing-image-analysis-via-a-texture-classification-neural-network.pdf
- https://dl.acm.org/doi/10.1007/s11042-020-09520-2
- https://ieeexplore.ieee.org/document/9491908/
- https://www.cvlibs.net/publications/Oechsle2019ICCV.pdf
- https://scribe.rip/aa627c8bb133
- https://www.nature.com/articles/s41598-022-23838-x
- https://scispace.com/pdf/texture-classification-with-neural-networks-1w6xazcexn.pdf
As the Senior Creative Curator at myArtsyGift, Sophie Bennett combines her background in Fine Arts with a passion for emotional storytelling. With over 10 years of experience in artisanal design and gift psychology, Sophie helps readers navigate the world of customizable presents. She believes that the best gifts aren't just bought—they are designed with heart. Whether you are looking for unique handcrafted pieces or tips on sentimental occasion planning, Sophie’s expert guides ensure your gift is as unforgettable as the moment it celebrates.
